Chainer - Python - Logistic Regression

I created a simple logistic regression model using Python and Chainer, and I would like some help optimizing the code.

One restriction: replacing the hand-written functionality with existing Chainer functions is not allowed. I know that Chainer has loss functions which achieve almost the same result, but the more complex model I am building uses a custom loss function.

The code is the following:

# The goal of this Gist is to implement a simple (Logistic Regression) model such that most of the functionalities of Chainer are used
# @author K.M.J. Jacobs
# @date 2018-02-24
# @website https://www.data-blogger.com

import chainer
from chainer import reporter as reporter_module
from chainer.training.extensions import LogReport
from chainer import iterators
from chainer import training
from chainer.datasets import TransformDataset
from chainer.training import extensions
from chainer.datasets import split_dataset
from chainer import optimizers
import chainer.optimizer
import chainer.initializers
import chainer.links as L
import chainer.functions as F
from chainer import Chain
import numpy as np

class LogisticRegressionModel(Chain):

    def __init__(self):
        super(LogisticRegressionModel, self).__init__()
        with self.init_scope():
            self.w = chainer.Parameter(initializer=chainer.initializers.Normal())
            self.w.initialize([3, 1])

    def __call__(self, x, t):
        # Call the loss function
        return self.loss(x, t)

    def predict(self, x):
        # Predict given an input (a, b, 1)
        z = F.matmul(x, self.w)
        return 1. / (1. + F.exp(-z))

    def loss(self, x, t):
        # Compute the loss for a given input (a, b, 1) and target
        y = self.predict(x)
        loss = -t * F.log(y) - (1 - t) * F.log(1 - y)
        reporter_module.report({'loss': loss.data[0, 0]}, self)
        reporter_module.report({'w': self.w[0, 0]}, self)
        return loss

def converter(minibatch, device=None):
    # For splitting array into inputs / targets
    inputs = []
    targets = []
    for item in minibatch:
        inputs.append(item[:3])
        targets.append(item[3])
    inputs = np.matrix(inputs)
    targets = np.array(targets)
    return inputs, targets

# Set the seed for reproduction
np.random.seed(0)

# The dataset consists of samples (a, b, 1) and the target is a function f such that f(a, b, 1) = a > b
# So for example: f(0.5, 0.6, 1) = 0. (False) and f(0.8, 0.2, 1) = 1. (True) since 0.8 > 0.2
# The 1 serves as bias so the model can train for a constant offset
N = 10000
data = np.random.random((N, 4))
data[:, 2] = 1.
data[:, 3] = data[:, 0] > data[:, 1]

# Split the data into a train and a test set such that there are 10 examples in the test set
data_test, data_train = split_dataset(data, 10)
train_iter = iterators.SerialIterator(data_train, 1, False, False)
test_iter = iterators.SerialIterator(data_test, 1, False, False)

# Setup the model
model = LogisticRegressionModel()

# Create the optimizer for the model
optimizer = optimizers.SGD()
optimizer.use_cleargrads(True)
optimizer.setup(model)

# Setup the training loop (and use the Evaluator, LogReport and PrintReport extension) with the following properties:
# - Run for 10.000 iterations
# - Evaluate every 1.000 iterations
# - Write logs every 1.000 iterations
# - Print the losses and the epoch, iteration and elapsed_time
trainer = training.Trainer(training.StandardUpdater(train_iter, optimizer, converter), (10001, 'iteration'), out='result')
trainer.extend(extensions.Evaluator(test_iter, model, converter), trigger=(1000, 'iteration'))
trainer.extend(extensions.LogReport(trigger=(1000, 'iteration')))
trainer.extend(extensions.PrintReport(['epoch', 'iteration', 'main/loss', 'validation/main/loss', 'main/w', 'validation/main/w', 'elapsed_time']))#, 'main/loss', 'validation/main/loss', 'elapsed_time'], ))
trainer.run()
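
For reference, the cross-entropy in the `loss` method above can be rewritten in an algebraically equivalent form that never takes the log of a value close to zero. The sketch below uses plain NumPy rather than Chainer, works on the logit `z` instead of the prediction `y`, and the names `naive_loss` and `stable_loss` are illustrative assumptions that do not appear in the original code.

```python
import numpy as np


def naive_loss(z, t):
    # Same formula as the loss method above, starting from the logit z.
    y = 1.0 / (1.0 + np.exp(-z))
    return -t * np.log(y) - (1.0 - t) * np.log(1.0 - y)


def stable_loss(z, t):
    # Equivalent form: softplus(z) - t * z, with softplus written as
    # max(z, 0) + log(1 + exp(-|z|)) so that exp never overflows.
    return np.maximum(z, 0.0) - t * z + np.log1p(np.exp(-np.abs(z)))


z = np.array([-2.0, 0.0, 3.0])
t = np.array([0.0, 1.0, 1.0])
print(naive_loss(z, t))
print(stable_loss(z, t))
```

Both functions should agree for moderate logits; the second one also stays finite for very large `|z|`. This is still a custom loss, so it would not conflict with the restriction stated in the question.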

I would like to keep the model code as clean as possible. As you can see, the __call__ method simply forwards to the loss method, and I suspect there is a cleaner way to invoke the loss in the training loop. I think it would be cleaner if __call__ returned the prediction and a separate loss method computed the loss. What are your thoughts on this?
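
To make the idea concrete, here is a minimal, framework-agnostic sketch of that split, written in plain NumPy. The class names `Predictor` and `LossWrapper` are hypothetical and are not Chainer APIs. Chainer's own `L.Classifier` follows a similar predictor-plus-wrapper layout, so a Chainer version would presumably make both classes subclasses of `Chain` and hand the wrapper to the optimizer; that part is not shown here.

```python
import numpy as np


class Predictor:
    """Hypothetical stand-in for the model: __call__ returns only the prediction."""

    def __init__(self, w):
        self.w = np.asarray(w, dtype=np.float64).reshape(3, 1)

    def __call__(self, x):
        # Sigmoid of the linear score for inputs of the form (a, b, 1)
        return 1.0 / (1.0 + np.exp(-(x @ self.w)))


class LossWrapper:
    """Hypothetical wrapper: owns the predictor and computes the custom loss."""

    def __init__(self, predictor):
        self.predictor = predictor

    def __call__(self, x, t):
        y = self.predictor(x)
        # Same cross-entropy as in the question, averaged over the batch
        return float(np.mean(-t * np.log(y) - (1.0 - t) * np.log(1.0 - y)))


predictor = Predictor([0.0, 0.0, 0.0])
model = LossWrapper(predictor)
x = np.array([[0.8, 0.2, 1.0]])
t = np.array([[1.0]])
print(predictor(x))  # prediction only
print(model(x, t))   # loss, as the training loop would request it
```

With this layout the training loop calls the wrapper, while inference code calls the predictor directly.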

I am also not sure about the converter function. Is there a better way to achieve the same result?
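
One possible alternative, as a sketch only: stack the minibatch once and slice it, instead of appending row by row. The name `split_converter` is hypothetical. Unlike the original, this version casts to `float32` and returns the targets as a column of shape `(batch, 1)` so that they line up with predictions of the same shape; both choices are assumptions. The `device` argument is accepted but ignored, so the sketch covers the CPU case only.

```python
import numpy as np


def split_converter(minibatch, device=None):
    # Stack the rows (a, b, 1, target) into one (batch, 4) array
    batch = np.stack(minibatch).astype(np.float32)
    inputs = batch[:, :3]   # shape (batch, 3)
    targets = batch[:, 3:]  # shape (batch, 1), kept as a column
    return inputs, targets


rows = [np.array([0.5, 0.6, 1.0, 0.0]), np.array([0.8, 0.2, 1.0, 1.0])]
inputs, targets = split_converter(rows)
print(inputs.shape, targets.shape)
```

Another route would be to store each example as an `(input, target)` tuple, in which case the default converter of `StandardUpdater`, `chainer.dataset.concat_examples`, should be able to do the batching without a custom function.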

Do you have any remarks or best practices for writing Chainer code?

edited Jul 3 at 17:27 by Billal BEGUERADJ
asked Feb 24 at 11:50 by www.data-blogger.com

Please post the relevant code directly in the question! Don't want to have to navigate outside this website, or the link may break in the future.
– Arnav Borborah
Feb 24 at 14:42

Code added :-).
– www.data-blogger.com
Feb 24 at 17:21

Adding the code doesn't matter: "not fully satisfied with the end result" makes the question off-topic.
– 200_success
Feb 24 at 18:01

Now it should not be off-topic anymore?
– www.data-blogger.com
Feb 24 at 18:15
