Tutorial
Getting started
To start using Layers, simply clone/download this repo and run

$ cd PoorNN/
$ pip install -r requirements.txt
$ python setup.py install

Layers is built on high-performance Fortran 90 code. Please install lapack/mkl and gfortran/ifort before running the above installation.
Construct a first feed-forward network
'''
Build a simple neural network that can be used to study the mnist data set.
'''
import numpy as np

from poornn.nets import ANN
from poornn.checks import check_numdiff
from poornn import functions, Linear
from poornn.utils import typed_randn


def build_ann():
    '''
    builds a single layer network for the mnist classification problem.
    '''
    F1 = 10
    I1, I2 = 28, 28
    eta = 0.1
    dtype = 'float32'

    W_fc1 = typed_randn(dtype, (F1, I1 * I2)) * eta
    b_fc1 = typed_randn(dtype, (F1,)) * eta

    # create an empty vertical network.
    ann = ANN()
    linear1 = Linear((-1, I1 * I2), dtype, W_fc1, b_fc1)
    ann.layers.append(linear1)
    ann.add_layer(functions.SoftMaxCrossEntropy, axis=1)
    ann.add_layer(functions.Mean, axis=0)
    return ann
# build and print it
ann = build_ann()
print(ann)

# random numerical differentiation check.
# prepare a one-hot target.
y_true = np.zeros(10, dtype='float32')
y_true[3] = 1
assert(all(check_numdiff(ann, var_dict={'y_true': y_true})))

# graphviz support
from poornn.visualize import viznn
viznn(ann, filename='./mnist_simple.png')
You will get the following terminal output
<ANN|s>: (-1, 784)|s -> ()|s
<Linear|s>: (-1, 784)|s -> (-1, 10)|s
- var_mask = (1, 1)
- is_unitary = False
<SoftMaxCrossEntropy|s>: (-1, 10)|s -> (-1,)|s
- axis = 1
<Mean|s>: (-1,)|s -> ()|s
- axis = 0
and an illustration of the network stored in ./mnist_simple.png, where the shape and type of the data flow are marked on the lines, and operations are boxes.
Note
The above example raises a library-not-found error if you don't have pygraphviz installed on your host. It is strongly recommended to install pygraphviz.
Ideology
First, what is a Layer? Layer is an abstract class (or interface) which defines a protocol.
This protocol specifies:

- Interface information, namely input_shape, output_shape, itype, otype and dtype, where dtype is the data type of the variables in this network, and itype, otype are the input and output array data types.
- Data flow manipulation methods, namely forward() and backward(). forward() performs the action \(y=f(x)\) and outputs \(y\), where \(f\) defines the functionality of this layer. backward() performs the action \((x,y),\frac{\partial J}{\partial y}\to\frac{\partial J}{\partial w},\frac{\partial J}{\partial x}\), where \(J\) and \(w\) are the target cost and the layer variables respectively. \(x\) and \(y\) are always required as a unified interface (which benefits network design); usually they are generated during a forward run.
- Variable getters and setters, namely get_variables(), set_variables(), num_variables (as a property) and set_runtime_vars(). get_variables() always returns a 1D array of length num_variables, and set_variables() takes such an array as input. Also, a layer can take runtime variables (which should be specified in tags, see below), like a seed used to control a DropOut layer. These getters and setters are required because we need a unified interface to access variables, without making the variables unreadable inside a layer implementation. Notice that reshape in numpy does not change array storage, so don't worry about performance.
- Tags (optional). The tags attribute defines additional properties of a layer; it is an optional dict-type attribute that belongs to a class. So far, these properties include 'runtimes' (list), 'is_inplace' (bool) and 'analytical' (int); see poornn.core.TAG_LIST for details. If tags is not defined, a layer uses poornn.core.DEFAULT_TAGS as its default tags.
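As a concrete illustration, a minimal object satisfying this protocol might look like the following plain-numpy sketch. The Sigmoid class and everything in it are hypothetical, written only to mirror the interface described above; this is not poornn code.

```python
import numpy as np


class Sigmoid:
    '''A minimal, hypothetical layer obeying the protocol (plain numpy).'''

    def __init__(self, input_shape, itype):
        # interface information: shapes and data types
        self.input_shape = self.output_shape = input_shape
        self.itype = self.otype = self.dtype = itype

    def forward(self, x):
        # y = f(x)
        return 1.0 / (1.0 + np.exp(-x))

    def backward(self, xy, dy):
        # (x, y), dJ/dy -> (dJ/dw, dJ/dx); this layer has no variables,
        # so dJ/dw is an empty 1D array
        x, y = xy
        return np.zeros(0, dtype=self.dtype), dy * y * (1.0 - y)

    def get_variables(self):
        # always a 1D array of length num_variables (here: empty)
        return np.zeros(0, dtype=self.dtype)

    def set_variables(self, v):
        pass

    @property
    def num_variables(self):
        return 0


layer = Sigmoid((-1, 4), 'float64')
x = np.linspace(-1.0, 1.0, 4)
y = layer.forward(x)
dw, dx = layer.backward((x, y), np.ones_like(y))
```

A finite-difference check of forward() against the dx returned by backward() is exactly the kind of test check_numdiff() automates.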
Any object satisfying the above protocol can be used as a Layer. An immediate benefit is that it can be tested, e.g. by a numerical differentiation test using poornn.checks.check_numdiff().
Through running the above example, we notice the following facts:

- Layers take numpy arrays as inputs and generate array outputs (notice that typed_randn() also generates numpy arrays).
- The network ANN is a subclass of Layer: it realizes all the interfaces claimed in Layer, and it is the simplest kind of vertical Container. Here, Container is a special kind of Layer that takes other layers as its entity and has no independent functionality. Containers can be nested, chained, ... to realize complex networks.
- -1 is used as a placeholder in a shape; however, using more than one placeholder in one shape tuple is not recommended, as it raises an error during reshaping.
- A Layer always takes input_shape and itype as its first two parameters to initialize, even when they are not needed. However, you can omit them by using the add_layer() method of an ANN or ParallelNN network when adding a layer to an existing network; add_layer() can infer the input shape and type from previous layers. Obviously, it fails when there are no layers in a container; then you should use net.layers.append() to add the first layer, or give at least one layer when initializing a container.
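The -1 placeholder behaves like numpy's own reshape placeholder, which also shows why at most one placeholder per shape is allowed. A plain-numpy illustration (not poornn code):

```python
import numpy as np

batch = np.arange(12, dtype='float32')  # 12 elements in total

# one placeholder: the missing dimension is inferred (12 / 4 = 3)
x = batch.reshape(-1, 4)
assert x.shape == (3, 4)

# two placeholders are ambiguous, so numpy raises an error --
# the same reason multiple -1 entries in one shape tuple are discouraged
try:
    batch.reshape(-1, -1)
    raised = False
except ValueError:
    raised = True
assert raised
```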
Note
In linear layers, Fortran ('F') ordered weights and inputs are used by default.
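Fortran ordering only changes how an array is laid out in memory, not its logical contents, as this plain-numpy illustration shows:

```python
import numpy as np

w_c = np.arange(6, dtype='float32').reshape(2, 3)  # C ('row-major') order
w_f = np.asfortranarray(w_c)                       # same values, 'F' layout

# only the storage order differs
assert w_f.flags['F_CONTIGUOUS'] and not w_c.flags['F_CONTIGUOUS']
# the logical contents are identical
assert np.array_equal(w_c, w_f)
# raveling in 'F' order exposes the column-major storage
assert w_f.ravel(order='F').tolist() == [0.0, 3.0, 1.0, 4.0, 2.0, 5.0]
```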