Tutorial

Getting started

To start using Layers, clone or download this repository and run

$ cd PoorNN/
$ pip install -r requirements.txt
$ python setup.py install

Layers is built on high-performance Fortran 90 code. Please install LAPACK/MKL and gfortran/ifort before running the above installation.
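After installation, you can check that the package is importable:

$ python -c "import poornn"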

Construct a first feed-forward network

'''
Build a simple neural network that can be used to study mnist data set.
'''

import numpy as np
from poornn.nets import ANN
from poornn.checks import check_numdiff
from poornn import functions, Linear
from poornn.utils import typed_randn


def build_ann():
    '''
    builds a single-layer network for the mnist classification problem.
    '''
    F1 = 10
    I1, I2 = 28, 28
    eta = 0.1
    dtype = 'float32'

    W_fc1 = typed_randn(dtype, (F1, I1 * I2)) * eta
    b_fc1 = typed_randn(dtype, (F1,)) * eta

    # create an empty vertical network.
    ann = ANN()
    linear1 = Linear((-1, I1 * I2), dtype, W_fc1, b_fc1)
    ann.layers.append(linear1)
    ann.add_layer(functions.SoftMaxCrossEntropy, axis=1)
    ann.add_layer(functions.Mean, axis=0)
    return ann


# build and print it
ann = build_ann()
print(ann)

# random numerical differentiation check.
# prepare a one-hot target.
y_true = np.zeros(10, dtype='float32')
y_true[3] = 1

assert all(check_numdiff(ann, var_dict={'y_true': y_true}))

# graphviz support
from poornn.visualize import viznn
viznn(ann, filename='./mnist_simple.png')

You will get the following terminal output

<ANN|s>: (-1, 784)|s -> ()|s
    <Linear|s>: (-1, 784)|s -> (-1, 10)|s
      - var_mask = (1, 1)
      - is_unitary = False
    <SoftMaxCrossEntropy|s>: (-1, 10)|s -> (-1,)|s
      - axis = 1
    <Mean|s>: (-1,)|s -> ()|s
      - axis = 0

and an illustration of the network stored in ./mnist_simple.png, which looks like

[figure: mnist_simple.png, a diagram of the network]

where the shape and type of the data flow are marked on the edges, and operations are drawn as boxes.

Note

The above example raises a library-not-found error if pygraphviz is not installed on your host. It is strongly recommended to install pygraphviz.
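Besides visualization, you can run the network forward by hand. The sketch below assumes that forward() performs \(y=f(x)\) and that runtime variables such as y_true are fed through set_runtime_vars(), as described in the protocol in the next section; the exact call signatures may differ in your version of the library.

# a minimal sketch (assumed API, see the Layer protocol below):
# evaluate the loss for one random flattened 28x28 input.
x = typed_randn('float32', (1, 28 * 28))
ann.set_runtime_vars({'y_true': y_true})   # feed the runtime target
loss = ann.forward(x)                      # scalar output, shape () as printed above
print(loss)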

Ideology

First, what is a Layer?

Layer is an abstract class (or interface) that defines a protocol. This protocol specifies:

  • Interface information, namely input_shape, output_shape, itype, otype and dtype, where dtype is the data type of the variables in this layer, and itype, otype are the input and output array data types.
  • Data flow manipulation methods, namely forward() and backward(). forward() performs the action \(y=f(x)\) and outputs \(y\), where \(f\) defines the functionality of this layer. backward() performs the action \((x,y),\frac{\partial J}{\partial y}\to\frac{\partial J}{\partial w},\frac{\partial J}{\partial x}\), where \(J\) and \(w\) are the target cost and the layer variables respectively. \(x\) and \(y\) are always required as part of a unified interface (this benefits network design); usually they are generated during a forward run.
  • Variable getters and setters, namely get_variables(), set_variables(), num_variables (as a property) and set_runtime_vars(). get_variables() always returns a 1D array of length num_variables, and set_variables() takes such an array as input. A layer can also take runtime variables (which should be specified in tags, see below), like a seed used to take control over a DropOut layer. These getters and setters are required because we need a unified interface to access variables without making them unreadable inside a layer implementation. Note that reshape in numpy does not change the underlying array storage, so don't worry about performance.
  • Tags (optional). The tags attribute defines additional properties of a layer; it is an optional dict-type attribute belonging to the class. So far, these properties include 'runtimes' (list), 'is_inplace' (bool) and 'analytical' (int); see poornn.core.TAG_LIST for details. If tags is not defined, the layer uses poornn.core.DEFAULT_TAGS as its default tags. An illustrative tags dict follows this list.
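For illustration, a hypothetical tags dict for a DropOut-like layer that consumes a runtime seed might look like this (the keys come from poornn.core.TAG_LIST as listed above; the values are made up for the example):

tags = {
    'runtimes': ['seed'],   # runtime variables, fed via set_runtime_vars()
    'is_inplace': False,    # forward() does not overwrite its input
    'analytical': 2,        # analyticity as an int; illustrative value only
}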

Any object satisfying the above protocol can be used as a Layer. An immediate benefit is that it can be tested, e.g. with the numerical differentiation test poornn.checks.check_numdiff().
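As a concrete illustration, here is a minimal sketch of such an object: a hand-written element-wise Tanh layer. This is not the library's own implementation, and the exact signatures poornn expects (in particular for backward()) should be checked against poornn.core; the sketch merely mirrors the protocol as described above.

import numpy as np

class MyTanh:
    '''
    a hand-rolled element-wise layer following the protocol (sketch only).
    '''

    def __init__(self, input_shape, itype):
        self.input_shape = input_shape
        self.output_shape = input_shape   # element-wise: shape is unchanged
        self.itype = self.otype = itype
        self.dtype = itype                # no variables, so dtype is nominal

    def forward(self, x):
        # y = f(x)
        return np.tanh(x)

    def backward(self, xy, dy):
        # (x, y), dJ/dy -> dJ/dw (empty, no variables), dJ/dx
        x, y = xy
        return np.zeros(0, dtype=self.dtype), dy * (1 - y ** 2)

    def get_variables(self):
        return np.zeros(0, dtype=self.dtype)

    def set_variables(self, variables):
        pass                              # nothing to set

    @property
    def num_variables(self):
        return 0

    def set_runtime_vars(self, var_dict=None):
        pass                              # no runtime variables

Because it follows the protocol, such an object can in principle be appended to a container and checked with check_numdiff() like any built-in layer.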

Running the above example, we notice the following facts:

  1. Layers take numpy arrays as inputs and generate numpy array outputs (note that typed_randn() also generates numpy arrays).
  2. The network ANN is a subclass of Layer: it realizes all the interfaces declared in Layer and is the simplest kind of vertical Container. Here, a Container is a special kind of Layer that takes other layers as its contents and has no independent functionality of its own. Containers can be nested, chained, etc. to realize complex networks.
  3. -1 is used as a placeholder in a shape. However, using more than one placeholder in a single shape tuple is not recommended, as it raises an error during reshaping.
  4. A Layer always takes input_shape and itype as its first 2 parameters at initialization, even when they are not needed! However, you can omit them via the add_layer() method of an ANN or ParallelNN network when adding a layer to an existing network; add_layer() can infer the input shape and type from previous layers. Naturally, this fails when a container holds no layers yet. In that case, use net.layers.append() to add the first layer, or supply at least one layer when initializing the container, as sketched below.
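The following sketch illustrates point 4, reusing only names that appeared in the example above (the parameter values are arbitrary):

from poornn.nets import ANN
from poornn import functions, Linear
from poornn.utils import typed_randn

net = ANN()   # empty container: add_layer() has nothing to infer from yet
net.layers.append(Linear((-1, 4), 'float32',
                         typed_randn('float32', (3, 4)) * 0.1,
                         typed_randn('float32', (3,)) * 0.1))
net.add_layer(functions.Mean, axis=0)   # shape and itype inferred from Linear
print(net)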

Note

In linear layers, Fortran ('F') ordered weights and inputs are used by default.
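If you prepare input arrays yourself, numpy can enforce this ordering explicitly; a minimal example:

import numpy as np

# make a Fortran-ordered float32 batch of 16 flattened 28x28 images
x = np.asfortranarray(np.random.randn(16, 784).astype('float32'))
print(x.flags['F_CONTIGUOUS'])   # True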