Basics of Neural Networks using Torch

—————————————-
-- for an RGB image
—————————————-
require 'image'
l = image.lena()
image.display(l)
—————————————-
-- for a grayscale image
—————————————-
fabio = image.fabio()
image.display(fabio)
—————————————-
-- weights of a neural network can also be displayed
—————————————-
require 'nn'
net = nn.SpatialConvolution(1, 64, 16, 16)
image.display{image=net.weight, zoom=3}
—————————————-
-- and of course the results of running a neural network
—————————————-
n = nn.SpatialConvolution(1, 16, 12, 12)
res = n:forward(image.rgb2y(image.lena()))
image.display{image=res:view(16, 1, 501, 501), zoom=0.5}
—————————————-
Link:
display in torch
—————————————-
For upsampling or resizing an image
—————————————-
A simple nearest-neighbor upsampler applied to every channel of the feature map. (A usage sketch follows the list of methods below.)

methods

1. scale

image.scale()

2. SpatialUpSamplingNearest

module = nn.SpatialUpSamplingNearest(scale)

3. SpatialUpSamplingBilinear

module = nn.SpatialUpSamplingBilinear(scale)
module = nn.SpatialUpSamplingBilinear({oheight=H, owidth=W})
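
A short usage sketch of the three options above (the sizes are illustrative, not prescriptive):

require 'image'
require 'nn'

local img = image.lena()                                   -- 3 x 512 x 512

-- 1. image.scale: resize to an explicit width and height
local small = image.scale(img, 256, 256)                   -- 3 x 256 x 256

-- 2. nearest-neighbour upsampling of every channel by an integer factor
local up2 = nn.SpatialUpSamplingNearest(2):forward(small)  -- 3 x 512 x 512

-- 3. bilinear upsampling, by factor or to an explicit output size
local up3 = nn.SpatialUpSamplingBilinear({oheight=384, owidth=384}):forward(small)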
—————————————-
For Reshape:
—————————————-

 1. nn.Reshape(d1, d2):forward(input)

Example: 
nn.Reshape(2,8):forward(x)

2. nn.View(...):forward(input)
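
A small sketch of both on a 16-element tensor (nn.Reshape copies by default, nn.View returns a view over the same storage):

require 'nn'
x = torch.range(1, 16)

print(nn.Reshape(2, 8):forward(x))   -- 2x8 tensor holding 1..16
print(nn.View(4, 4):forward(x))      -- 4x4 tensor over the same data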


----------------------------------------
For Display Image:
—————————————-

image.display(input, […])

The input, which is either a Tensor of size HxW, KxHxW, or Kx3xHxW, or a list of such tensors, is first prepared for display by passing it through image.toDisplayTensor:

input = image.toDisplayTensor{
   input=input, padding=padding, nrow=nrow, saturate=saturate, 
   scaleeach=scaleeach, min=min, max=max, symmetric=symm
}

image.display(input)

Run using:  qlua my_display.lua

—————————————-
For displaying an image in an iTorch (IPython) notebook:
—————————————-
itorch.image()
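
A minimal sketch for an iTorch notebook cell (itorch is only defined when the code runs under the iTorch kernel):

require 'image'
itorch.image(image.lena())                    -- display a single image inline
itorch.image({image.lena(), image.fabio()})   -- display a list of images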
—————————————-
-- for element-wise product
—————————————-
torch.cmul(q,w)

Example:
q
 1  2  3
 2  4  6

w
 1  2  3
 2  4  6

It should yield:

1   4   9
4  16  36

th> torch.cmul(q,w)
  1   4   9
  4  16  36


----------------------------------------
Gaussian Filter

----------------------------------------

d=image.gaussian(3)
print('d',d)
>>d	
 0.1690  0.4111  0.1690
 0.4111  1.0000  0.4111
 0.1690  0.4111  0.1690
[torch.DoubleTensor of size 3x3]


Reference: http://hunch.net/~nyoml/torch7.pdf
model:add( nn.SpatialContrastiveNormalization(16, image.gaussian(3)) )
—————————————-
-- OpenCV filters using Torch
—————————————-
cv.imshow{winname="Original image with text", image=image}
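
A minimal sketch, assuming the torch-opencv bindings (the cv rock) are installed; the file name is illustrative:

local cv = require 'cv'
require 'cv.imgcodecs'   -- cv.imread
require 'cv.highgui'     -- cv.imshow, cv.waitKey

local img = cv.imread{'lena.jpg'}
cv.imshow{winname="Original image", image=img}
cv.waitKey{0}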


https://github.com/torch/image/tree/master/test

--------------------------------------------------------------
-- Torch errors
-------------------------------------------------------------

Issue 1: I got this error: "torch.FloatTensor has no call operator"

This happens when a function and a variable share the same name: the variable shadows the function, so the later call hits the tensor instead of the function. Renaming the function (or the variable) resolves the error. A small sketch follows.
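
A minimal sketch reproducing the collision (the name score is made up for illustration):

require 'torch'

function score(x) return x:sum() end        -- a function named 'score'
score = torch.FloatTensor(3):fill(1)        -- later, a variable reuses the name

-- the call below now hits the tensor, not the function, and fails with
-- "torch.FloatTensor has no call operator"
local ok, err = pcall(function() return score(torch.FloatTensor(3)) end)
print(ok, err)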

Issue 2:
/tmp/luarocks_cutorch-scm-1-3948/cutorch/lib/THC/THCTensorIndex.cu:321: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [27,0,0], thread: [56,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
(the same assertion repeats for threads 57 through 63)
qlua: .../gpu/badri1/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 4 module of nn.Sequential:

./misc/LSTM.lua:160: cublas runtime error : library not initialized at /tmp/luarocks_cutorch-scm-1-3948/cutorch/lib/THC/THCGeneral.c:378
stack traceback:


Solution: execute   source ~/.bashrc   in the terminal.


Issue 3:
In 3 module of nn.Sequential:
/home/cse/torch/install/share/lua/5.1/nn/Linear.lua:86: invalid arguments: CudaTensor number number nil CudaTensor 
expected arguments: *CudaTensor~2D* [CudaTensor~2D] [float] CudaTensor~2D CudaTensor~2D | *CudaTensor~2D* float [CudaTensor~2D] float CudaTensor~2D CudaTensor~2D

Answer:
This is a dimension mismatch, or some undefined (nil) value being fed into the layer. A small sketch of a dimension mismatch follows.
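
A small sketch of the kind of mismatch that produces such errors, here on the CPU with nn.Linear (the sizes are illustrative):

require 'nn'

local lin = nn.Linear(10, 5)                -- expects 10-dimensional inputs

local ok, err = pcall(function()
   return lin:forward(torch.Tensor(8))      -- 8-dimensional input: size mismatch
end)
print(ok, err)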


 ---------------------------------------------------------
DistanceRatioCriterion in triplet
 ---------------------------------------------------------

https://github.com/torch/nn/blob/master/doc/criterion.md#nn.DistanceRatioCriterion
https://github.com/torch/nn/blob/master/DistanceRatioCriterion.lua
---------------------------------------------------------------------------------------


https://github.com/Atcold/torch-TripletEmbedding/blob/master/test.lua
print(b('Fresh-embeddings-computation network:')); print(parallel)

-- cost function (constructor signature: nn.TripletEmbeddingCriterion([alpha]))
loss = nn.TripletEmbeddingCriterion()

for i = 1, 9 do
   print(colour.green('Epoch ' .. i))
   predict = parallel:forward({aImgs, pImgs, nImgs})
   err     = loss:forward(predict)
   errGrad = loss:backward(predict)
   parallel:zeroGradParameters()
   parallel:backward({aImgs, pImgs, nImgs}, errGrad)
   parallel:updateParameters(0.01)
end
-----------------------------------------------------------------------------------
https://github.com/torch/nn/blob/master/DistanceRatioCriterion.lua
https://github.com/torch/nn/blob/master/doc/criterion.md#nn.DistanceRatioCriterion
https://github.com/vbalnt/pnnet/blob/master/train/run.lua
local EmbeddingNet = require(opt.network)
local TripletNet = nn.TripletNet(EmbeddingNet)
local Loss = nn.DistanceRatioCriterion()
   -- loss = -log( exp(-Ds) / ( exp(-Ds) + exp(-Dd) ) )

Sample example

   torch.setdefaulttensortype("torch.FloatTensor")

   require 'nn'

   -- triplet : with batchSize of 32 and dimensionality 512
   sample = {torch.rand(32, 512), torch.rand(32, 512), torch.rand(32, 512)}

   embeddingModel = nn.Sequential()
   embeddingModel:add(nn.Linear(512, 96)):add(nn.ReLU())

   tripleModel = nn.ParallelTable()
   tripleModel:add(embeddingModel)
   tripleModel:add(embeddingModel:clone('weight', 'bias', 
                                        'gradWeight', 'gradBias'))
   tripleModel:add(embeddingModel:clone('weight', 'bias',
                                        'gradWeight', 'gradBias'))

   -- Similar sample distance w.r.t anchor sample
   posDistModel = nn.Sequential()
   posDistModel:add(nn.NarrowTable(1,2)):add(nn.PairwiseDistance())

   -- Different sample distance w.r.t anchor sample
   negDistModel = nn.Sequential()
   negDistModel:add(nn.NarrowTable(2,2)):add(nn.PairwiseDistance())

   distanceModel = nn.ConcatTable():add(posDistModel):add(negDistModel)

   -- Complete Model
   model = nn.Sequential():add(tripleModel):add(distanceModel)

   -- DistanceRatioCriterion
   criterion = nn.DistanceRatioCriterion(true)

   -- Forward & Backward
   output = model:forward(sample)
   loss   = criterion:forward(output)
   dLoss  = criterion:backward(output)
   model:backward(sample, dLoss)


 ---------------------------------------------------------
 Good triplet loss code in Torch
-----------------------------------------------------------
 https://github.com/eladhoffer/TripletNet/blob/master/Main.lua
 https://github.com/Atcold/torch-TripletEmbedding/blob/master/test.lua 
 
https://github.com/jhjin/triplet-criterion/blob/master/TripletCriterion.lua
 --------------------------------------------------------------------------- 
Deep Learning Concepts: Example LSTM with SGD

We take the LSTM-with-Sequencer example and replace the training for-loop and the manual gradient update with a feval function driven by optim.

require 'rnn'
require 'optim'

batchSize = 50
rho = 5
hiddenSize = 64
nIndex = 10000

-- define the model
model = nn.Sequential()
model:add(nn.Sequencer(nn.LookupTable(nIndex, hiddenSize)))
model:add(nn.Sequencer(nn.FastLSTM(hiddenSize, hiddenSize, rho)))
model:add(nn.Sequencer(nn.Linear(hiddenSize, nIndex)))
model:add(nn.Sequencer(nn.LogSoftMax()))
criterion = nn.SequencerCriterion(nn.ClassNLLCriterion())

This defines the model, with each layer decorated by a Sequencer. Note that the criterion is wrapped in nn.SequencerCriterion.


-- create a dummy dataset (task: predict the next item)
dataset = torch.randperm(nIndex)

-- offsets is a convenient set of pointers used to iterate over the dataset
offsets = {}
for i= 1, batchSize do
   table.insert(offsets, math.ceil(math.random() * batchSize))
end
offsets = torch.LongTensor(offsets)


-- method to compute a batch
function nextBatch()
	local inputs, targets = {}, {}
   for step = 1, rho do
      -- get a batch of inputs
      table.insert(inputs, dataset:index(1, offsets))
      -- shift the batch indices by one
      offsets:add(1)
      for j=1,batchSize do
         if offsets[j] > nIndex then
            offsets[j] = 1
         end
      end
      -- fill the batch of targets
      table.insert(targets, dataset:index(1, offsets))
   end
	return inputs, targets
end

Defines:

  • a dummy dataset composed of a random permutation from 1 to nIndex,
  • an offset table to store a list of pointers to scan the dataset, and
  • a method to get the next batch.

-- get the flattened weights and the gradients w.r.t. the weights from the model
x, dl_dx = model:getParameters()

-- In the following code, we define a closure, feval, which computes
-- the value of the loss function at a given point x, and the gradient of
-- that function with respect to x. x is the vector of trainable weights;
-- feval extracts a mini-batch via the nextBatch method.
feval = function(x_new)
	-- copy the weights if they have changed
	if x ~= x_new then
		x:copy(x_new)
	end
	-- select a training batch
	local inputs, targets = nextBatch()
	-- reset gradients (gradients are always accumulated, to accommodate
	-- batch methods)
	dl_dx:zero()

	-- evaluate the loss function and its derivative with respect to x, given a mini batch
	local prediction = model:forward(inputs)
	local loss_x = criterion:forward(prediction, targets)
	model:backward(inputs, criterion:backward(prediction, targets))

	return loss_x, dl_dx
end

This gets the parameters from the model we built and defines the feval closure for the optimizer.


sgd_params = {
   learningRate = 0.1,
   learningRateDecay = 1e-4,
   weightDecay = 0,
   momentum = 0
}

-- cycle on data
for i = 1,1e4 do
	-- train a mini_batch of batchSize in parallel
	_, fs = optim.sgd(feval,x, sgd_params)

	if sgd_params.evalCounter % 100 == 0 then
		print('error for iteration ' .. sgd_params.evalCounter  .. ' is ' .. fs[1] / rho)
		-- print(sgd_params)
	end
end

This defines the parameters for the method and the main loop that runs mini-batches over the dataset. Every 100 mini-batches it prints the error.

REF:

http://rnduja.github.io/2015/10/26/deep_learning_with_torch_step_7_optim/

https://github.com/torch/optim/blob/master/sgd.lua

——————————————————————-

Discussion on DL training and testing

——————————————————-

  • 1. What is optim actually doing? We need to do almost everything, including cost and gradient evaluation, in feval(), and the only extra thing we do is iterate with a for loop. So what is the actual use of optim?
    2. How should the feval() definition change for different optim algorithms?

  • Hi,
    First of all, thank you for this very helpful series on Torch; I’m just taking my first steps in Deep Learning and it is a great starting point!
    I’m a bit confused by your feval function: you reset dl_dx to zeros, but you do not assign it any new values before returning… Shouldn’t the last lines be

    dl_dx = criterion:backward(prediction, targets)
    model:backward(inputs, dl_dx)
    return loss_x, dl_dx

     

  • Hi! In the feval = function(x_new) closure, dl_dx is reset to 0, so “return loss_x, dl_dx” is actually “return loss_x, 0”.

    The gradient dl_dx is then accumulated again within the function optim.sgd(feval,x,sgd_params) in each iteration. You can refer to the implementation of sgd in https://github.com/torch/op… .

  • Actually, that is not correct.

    dl_dx is a pointer to the accumulated gradients in the model. These gradients are updated by model:backward() calls, not by optim.sgd() calls.
    The optim.sgd() function only updates the weights using the gradients; it doesn’t *accumulate* the gradient values at all.

    To correctly answer pbeaug: you are mixing up the last two steps of optimisation:
    Step 1. Get a prediction.
    Step 2. Use difference in prediction and expected values to calculate gradients for each weight.
    Step 3. Use an optimisation algorithm to USE the gradients to update the weights.

    Step 2 (more detail)
    The result of criterion:backward() is just the loss gradient after being back-propagated through the error (criterion) function. You still have to back-propagate it through the model:
    model:backward(inputs, criterion:backward(prediction, targets))
    When you do this call, each layer in the model has the gradients calculated and accumulated (added together) in the tensor pointed to by dl_dx, but nothing is done with it until the optim.sgd() call.

    Step 3 (more detail)
    In the current implementation https://github.com/torch/op…
    Lines 46-55 modify the gradients based on weight decay settings (otherwise known as L2 regularisation)
    Lines 58-69 modify the gradients based on momentum settings (i.e. uses past gradients as a guide for new gradients)
    Lines 72-83 finally modify the weights using the gradients.
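
    A minimal sketch of this split, reusing feval, x, and dl_dx from the LSTM example above (optim.adam is mentioned only to show that feval itself does not change between algorithms):

    -- feval runs forward + backward; model:backward() accumulates gradients into dl_dx
    local loss, grads = feval(x)                  -- grads is the same tensor as dl_dx

    -- the optimiser only *uses* those gradients to update the weights x
    optim.sgd(feval, x, {learningRate = 0.01})
    -- optim.adam(feval, x, {learningRate = 0.001})  -- same feval, different update rule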

     

    —————————————————————

    What is a Module?

    ——————————————————

    https://github.com/torch/nn/blob/master/doc/module.md#nn.Module.updateOutput

    https://github.com/torch/nn/blob/master/Module.lua

    ——————————————————

    A module described with an example (a custom-module sketch follows the link below):

    ———————————————

    https://github.com/torch/nn/blob/master/Linear.lua
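
    A minimal sketch of a custom module in the style of Linear.lua, overriding updateOutput and updateGradInput (the nn.ScaleByConstant name is made up for illustration):

    require 'nn'

    local ScaleByConstant, parent = torch.class('nn.ScaleByConstant', 'nn.Module')

    function ScaleByConstant:__init(factor)
       parent.__init(self)
       self.factor = factor
    end

    function ScaleByConstant:updateOutput(input)
       self.output:resizeAs(input):copy(input):mul(self.factor)
       return self.output
    end

    function ScaleByConstant:updateGradInput(input, gradOutput)
       self.gradInput:resizeAs(gradOutput):copy(gradOutput):mul(self.factor)
       return self.gradInput
    end

    print(nn.ScaleByConstant(2):forward(torch.Tensor{1, 2, 3}))  --  2  4  6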

     

  • Concept of shared parameters (a small sketch follows the link below)
  • https://groups.google.com/forum/#!topic/torch7/VtG46T6jxlM
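
A minimal sketch of parameter sharing via clone (the idea discussed in the link above): the clone shares the underlying weight storage with the original, so an update through one module is visible through the other.

   require 'nn'

   local lin    = nn.Linear(4, 2)
   local shared = lin:clone('weight', 'bias', 'gradWeight', 'gradBias')

   lin.weight:fill(1)        -- write through the original module...
   print(shared.weight)      -- ...the shared clone sees the same values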