Tutorial

Getting started

To start using Layers, clone or download this repository and run

$ cd PoorNN/
$ pip install -r requirements.txt
$ python setup.py install

Layers is built on high-performance Fortran 90 code. Please install LAPACK/MKL and gfortran/ifort before running the above installation.
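After installation, you can check that the package is importable:

$ python -c "import poornn"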

Construct a first feed-forward network

'''
Build a simple neural network that can be used to study mnist data set.
'''

import numpy as np
from poornn.nets import ANN
from poornn.checks import check_numdiff
from poornn import functions, Linear
from poornn.utils import typed_randn


def build_ann():
    '''
    builds a single-layer network for the mnist classification problem.
    '''
    F1 = 10
    I1, I2 = 28, 28
    eta = 0.1
    dtype = 'float32'

    W_fc1 = typed_randn(dtype, (F1, I1 * I2)) * eta
    b_fc1 = typed_randn(dtype, (F1,)) * eta

    # create an empty vertical network.
    ann = ANN()
    linear1 = Linear((-1, I1 * I2), dtype, W_fc1, b_fc1)
    ann.layers.append(linear1)
    ann.add_layer(functions.SoftMaxCrossEntropy, axis=1)
    ann.add_layer(functions.Mean, axis=0)
    return ann


# build and print it
ann = build_ann()
print(ann)

# random numerical differentiation check.
# prepare a one-hot target.
y_true = np.zeros(10, dtype='float32')
y_true[3] = 1

assert all(check_numdiff(ann, var_dict={'y_true': y_true}))

# graphviz support
from poornn.visualize import viznn
viznn(ann, filename='./mnist_simple.png')

You will get the following terminal output

<ANN|s>: (-1, 784)|s -> ()|s
    <Linear|s>: (-1, 784)|s -> (-1, 10)|s
      - var_mask = (1, 1)
      - is_unitary = False
    <SoftMaxCrossEntropy|s>: (-1, 10)|s -> (-1,)|s
      - axis = 1
    <Mean|s>: (-1,)|s -> ()|s
      - axis = 0

and an illustration of the network stored in ./mnist_simple.png, which looks like

[figure: mnist_simple.png, a diagram of the network]

where the shape and type of the data flow are marked on the edges, and operations are drawn as boxes.

Note

The above example raises a library-not-found error if pygraphviz is not installed on your host. It is strongly recommended to install pygraphviz.
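Besides visualization, you can run the network forward by hand. The sketch below assumes that forward() performs \(y=f(x)\) and that runtime variables such as y_true are fed through set_runtime_vars(), as described in the protocol in the next section; the exact call signatures may differ in your version of the library.

# a minimal sketch (assumed API, see the Layer protocol below):
# evaluate the loss for one random flattened 28x28 input.
x = typed_randn('float32', (1, 28 * 28))
ann.set_runtime_vars({'y_true': y_true})   # feed the runtime target
loss = ann.forward(x)                      # scalar output, shape () as printed above
print(loss)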

Ideology

First, what is a Layer?

Layer is an abstract class (or interface) that defines a protocol. This protocol specifies:

  • Interface information, namely input_shape, output_shape, itype, otype and dtype, where dtype is the data type of the variables in this layer, and itype, otype are the input and output array data types.
  • Data flow manipulation methods, namely forward() and backward(). forward() performs the action \(y=f(x)\) and outputs \(y\), where \(f\) defines the functionality of this layer. backward() performs the action \((x,y),\frac{\partial J}{\partial y}\to\frac{\partial J}{\partial w},\frac{\partial J}{\partial x}\), where \(J\) and \(w\) are the target cost and the layer variables respectively. \(x\) and \(y\) are always required as part of a unified interface (this benefits network design); usually they are generated during a forward run.
  • Variable getters and setters, namely get_variables(), set_variables(), num_variables (as a property) and set_runtime_vars(). get_variables() always returns a 1D array of length num_variables, and set_variables() takes such an array as input. A layer can also take runtime variables (which should be specified in tags, see below), like a seed used to take control over a DropOut layer. These getters and setters are required because we need a unified interface to access variables without making them unreadable inside a layer implementation. Note that reshape in numpy does not change the underlying array storage, so don't worry about performance.
  • Tags (optional). The tags attribute defines additional properties of a layer; it is an optional dict-type attribute belonging to the class. So far, these properties include 'runtimes' (list), 'is_inplace' (bool) and 'analytical' (int); see poornn.core.TAG_LIST for details. If tags is not defined, the layer uses poornn.core.DEFAULT_TAGS as its default tags. An illustrative tags dict follows this list.
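For illustration, a hypothetical tags dict for a DropOut-like layer that consumes a runtime seed might look like this (the keys come from poornn.core.TAG_LIST as listed above; the values are made up for the example):

tags = {
    'runtimes': ['seed'],   # runtime variables, fed via set_runtime_vars()
    'is_inplace': False,    # forward() does not overwrite its input
    'analytical': 2,        # analyticity as an int; illustrative value only
}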

Any object satisfying the above protocol can be used as a Layer. An immediate benefit is that it can be tested, e.g. with the numerical differentiation test poornn.checks.check_numdiff().
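As a concrete illustration, here is a minimal sketch of such an object: a hand-written element-wise Tanh layer. This is not the library's own implementation, and the exact signatures poornn expects (in particular for backward()) should be checked against poornn.core; the sketch merely mirrors the protocol as described above.

import numpy as np

class MyTanh:
    '''
    a hand-rolled element-wise layer following the protocol (sketch only).
    '''

    def __init__(self, input_shape, itype):
        self.input_shape = input_shape
        self.output_shape = input_shape   # element-wise: shape is unchanged
        self.itype = self.otype = itype
        self.dtype = itype                # no variables, so dtype is nominal

    def forward(self, x):
        # y = f(x)
        return np.tanh(x)

    def backward(self, xy, dy):
        # (x, y), dJ/dy -> dJ/dw (empty, no variables), dJ/dx
        x, y = xy
        return np.zeros(0, dtype=self.dtype), dy * (1 - y ** 2)

    def get_variables(self):
        return np.zeros(0, dtype=self.dtype)

    def set_variables(self, variables):
        pass                              # nothing to set

    @property
    def num_variables(self):
        return 0

    def set_runtime_vars(self, var_dict=None):
        pass                              # no runtime variables

Because it follows the protocol, such an object can in principle be appended to a container and checked with check_numdiff() like any built-in layer.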

Running the above example, we notice the following facts:

  1. Layers take numpy arrays as inputs and generate numpy array outputs (note that typed_randn() also generates numpy arrays).
  2. The network ANN is a subclass of Layer: it realizes all the interfaces declared in Layer and is the simplest kind of vertical Container. Here, a Container is a special kind of Layer that takes other layers as its contents and has no independent functionality of its own. Containers can be nested, chained, etc. to realize complex networks.
  3. -1 is used as a placeholder in a shape. However, using more than one placeholder in a single shape tuple is not recommended, as it raises an error during reshaping.
  4. A Layer always takes input_shape and itype as its first 2 parameters at initialization, even when they are not needed! However, you can omit them via the add_layer() method of an ANN or ParallelNN network when adding a layer to an existing network; add_layer() can infer the input shape and type from previous layers. Naturally, this fails when a container holds no layers yet. In that case, use net.layers.append() to add the first layer, or supply at least one layer when initializing the container, as sketched below.
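The following sketch illustrates point 4, reusing only names that appeared in the example above (the parameter values are arbitrary):

from poornn.nets import ANN
from poornn import functions, Linear
from poornn.utils import typed_randn

net = ANN()   # empty container: add_layer() has nothing to infer from yet
net.layers.append(Linear((-1, 4), 'float32',
                         typed_randn('float32', (3, 4)) * 0.1,
                         typed_randn('float32', (3,)) * 0.1))
net.add_layer(functions.Mean, axis=0)   # shape and itype inferred from Linear
print(net)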

Note

In linear layers, Fortran ('F') ordered weights and inputs are used by default.
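If you prepare input arrays yourself, numpy can enforce this ordering explicitly; a minimal example:

import numpy as np

# make a Fortran-ordered float32 batch of 16 flattened 28x28 images
x = np.asfortranarray(np.random.randn(16, 784).astype('float32'))
print(x.flags['F_CONTIGUOUS'])   # True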