For-Loop with numpy array too slow

I'm trying to speed up the execution of the following problem:



I have two functions which are used together.



Here are the functions explained:



The first function simulates N coin flips and returns an array of strings with the results:



def throw_a_coin(N):
    return np.random.choice(['H','T'], size=N)


The second function takes number_of_samples and sample_size and returns sample_probs.
It simulates number_of_samples runs of sample_size coin flips each and stores num_heads / sample_size in sample_probs for each run.



Here is the code:



def make_throws(number_of_samples, sample_size):
    sample_probs = np.zeros(number_of_samples)
    for i in range(number_of_samples):
        num_heads = sum(throw_a_coin(sample_size) == "H")
        sample_probs[i] = float(num_heads) / sample_size
    return sample_probs


Now I want to simulate multiple runs with different sample sizes, doing 200 runs for each sample size. So for sample_size = 1 I would do 200 runs of 1 coin flip, for sample_size = 10 I would do 200 runs of 10 coin flips, and so on.



Here is the code that takes so long:



mean_of_sample_means = np.zeros(len(sample_sizes))
std_dev_of_sample_means = np.zeros(len(sample_sizes))

for i in range(len(sample_sizes)):
    prob = make_throws(200, sample_sizes[i])
    mean_of_sample_means[i] = np.mean(prob)
    std_dev_of_sample_means[i] = np.std(prob)


I am pretty sure that this process can be improved by getting rid of the for loops and using array operations instead, but I just cannot think of how to apply throw_a_coin or make_throws to an array.







asked Apr 15 at 10:43 by David P

  • To the 2 people who voted to close this question: what exactly is wrong with it? – Mast, Apr 15 at 11:42

  • @πάνταῥεῖ The code looks very real to me. All required methods are in place, so this is fully reviewable. – Simon Forsberg♦, Apr 15 at 13:02

1 Answer

Here is the toy example I built from your snippets:



import numpy as np


def throw_a_coin(N):
    return np.random.choice(['H','T'], size=N)


def make_throws(number_of_samples, sample_size):
    sample_probs = np.zeros(number_of_samples)
    for i in range(number_of_samples):
        num_heads = sum(throw_a_coin(sample_size) == "H")
        sample_probs[i] = float(num_heads) / sample_size
    return sample_probs

sample_sizes = [np.random.randint(1, 1e3) for idx in range(500)]

mean_of_sample_means = np.zeros(len(sample_sizes))
std_dev_of_sample_means = np.zeros(len(sample_sizes))

for i in range(len(sample_sizes)):
    prob = make_throws(200, sample_sizes[i])
    mean_of_sample_means[i] = np.mean(prob)
    std_dev_of_sample_means[i] = np.std(prob)


cProfile (ordered by cumtime) reveals the problem (runs in 82.359 seconds):



ncalls tottime percall cumtime percall filename:lineno(function)
133/1 0.003 0.000 82.359 82.359 built-in method builtins.exec
1 0.002 0.002 82.359 82.359 fast_flip.py:1(<module>)
500 0.940 0.002 82.200 0.164 fast_flip.py:8(make_throws)
100000 79.111 0.001 79.111 0.001 built-in method builtins.sum
100000 0.058 0.000 2.148 0.000 fast_flip.py:4(throw_a_coin)
100000 1.455 0.000 2.090 0.000 method 'choice' of 'mtrand.RandomState' objects
100000 0.207 0.000 0.635 0.000 fromnumeric.py:2456(prod)
100000 0.026 0.000 0.429 0.000 _methods.py:34(_prod)
101500 0.409 0.000 0.409 0.000 method 'reduce' of 'numpy.ufunc' objects
6 0.000 0.000 0.176 0.029 __init__.py:1(<module>)
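
For reference, output like the listing above can be produced with the standard library profiler, e.g. by running python -m cProfile -s cumtime fast_flip.py from the command line, or from within Python (assuming the script body is wrapped in a main() function):

import cProfile
cProfile.run("main()", sort="cumtime")  # main() is an assumption; profile whatever entry point you have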


There is a steep gap after builtins.sum, i.e. you spend most of the time there. We can use np.sum instead, which pushes the total runtime down to 3.457 seconds:



ncalls tottime percall cumtime percall filename:lineno(function)
133/1 0.003 0.000 3.457 3.457 built-in method builtins.exec
1 0.002 0.002 3.457 3.457 fast_flip.py:1(<module>)
500 0.905 0.002 3.307 0.007 fast_flip.py:8(make_throws)
100000 0.046 0.000 1.869 0.000 fast_flip.py:4(throw_a_coin)
100000 1.287 0.000 1.823 0.000 method 'choice' of 'mtrand.RandomState' objects
201500 0.702 0.000 0.702 0.000 method 'reduce' of 'numpy.ufunc' objects
100000 0.172 0.000 0.536 0.000 fromnumeric.py:2456(prod)
100000 0.136 0.000 0.532 0.000 fromnumeric.py:1778(sum)
100000 0.023 0.000 0.378 0.000 _methods.py:31(_sum)
100000 0.021 0.000 0.364 0.000 _methods.py:34(_prod)
6 0.000 0.000 0.178 0.030 __init__.py:1(<module>)
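
The change itself is just one line inside make_throws; a minimal sketch, with the rest of the function unchanged:

num_heads = np.sum(throw_a_coin(sample_size) == "H")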


Further, we can replace the strings "H" and "T" with Booleans and stay in numpy for longer (down to 1.633 seconds):



ncalls tottime percall cumtime percall filename:lineno(function)
133/1 0.003 0.000 1.633 1.633 built-in method builtins.exec
1 0.001 0.001 1.633 1.633 fast_flip.py:1(<module>)
500 0.101 0.000 1.485 0.003 fast_flip.py:11(make_throws)
100000 0.169 0.000 0.893 0.000 fast_flip.py:4(throw_a_coin)
100000 0.724 0.000 0.724 0.000 method 'uniform' of 'mtrand.RandomState' objects
100000 0.122 0.000 0.491 0.000 fromnumeric.py:1778(sum)
100000 0.024 0.000 0.354 0.000 _methods.py:31(_sum)
101500 0.334 0.000 0.334 0.000 method 'reduce' of 'numpy.ufunc' objects
6 0.000 0.000 0.178 0.030 __init__.py:1(<module>)
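
This intermediate version is not shown explicitly above, so the following is only a sketch of what it might look like, assuming heads are encoded as True and drawn by thresholding uniform samples (consistent with the 'uniform' calls in the profile):

def throw_a_coin(N):
    # True == heads (assumed encoding, replacing the original string version)
    return np.random.uniform(size=N) > 0.5


def make_throws(number_of_samples, sample_size):
    sample_probs = np.zeros(number_of_samples)
    for i in range(number_of_samples):
        num_heads = np.sum(throw_a_coin(sample_size))  # booleans sum as 0/1
        sample_probs[i] = float(num_heads) / sample_size
    return sample_probs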


Next, we can get rid of throw_a_coin entirely and instead sample a number_of_samples x sample_size array of uniformly distributed random numbers and threshold it. This also allows us to vectorize the inner for loop and stay in numpy even longer (0.786 seconds):



ncalls tottime percall cumtime percall filename:lineno(function)
133/1 0.003 0.000 0.786 0.786 built-in method builtins.exec
1 0.003 0.003 0.786 0.786 fast_flip.py:1(<module>)
500 0.053 0.000 0.634 0.001 fast_flip.py:4(make_throws)
500 0.526 0.001 0.526 0.001 method 'uniform' of 'mtrand.RandomState' objects
6 0.000 0.000 0.179 0.030 __init__.py:1(<module>)


Here is the code:



import numpy as np


def make_throws(number_of_samples, sample_size):
    # True == Heads
    throws = np.random.uniform(size=(number_of_samples, sample_size)) > 0.5
    sample_probs = np.sum(throws, axis=1) / sample_size
    return sample_probs

sample_sizes = [np.random.randint(1, 1e3) for idx in range(500)]

mean_of_sample_means = np.zeros(len(sample_sizes))
std_dev_of_sample_means = np.zeros(len(sample_sizes))

for i in range(len(sample_sizes)):
    prob = make_throws(200, sample_sizes[i])
    mean_of_sample_means[i] = np.mean(prob)
    std_dev_of_sample_means[i] = np.std(prob)


At this point we could start to tackle the remaining for loop, but it would get very micro-optimized. One might consider aggregating the results in prob and then using np.mean(prob, axis=1) and np.std(prob, axis=1) outside the loop, but that only nets about 20 ms, so it's more of a personal preference thing.
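
A minimal sketch of that variant, assuming all sample sizes keep the same 200 runs so the per-size results can be stacked into one 2D array:

probs = np.stack([make_throws(200, n) for n in sample_sizes])  # shape (len(sample_sizes), 200)
mean_of_sample_means = np.mean(probs, axis=1)
std_dev_of_sample_means = np.std(probs, axis=1)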






answered Apr 15 at 14:19 by FirefoxMetzger (accepted)

  • Wow, very detailed and very helpful! I didn't know I could see how long each single operation takes! Thank you :) – David P, Apr 15 at 17:32

  • @DavidP Yes, that's what profilers are for. They exist in every major language, so this is not a Python-specific thing. There is also LineProfiler, which allows you to nicely view runtime per line for specific functions. – FirefoxMetzger, Apr 15 at 17:52

  • I will save this comment in my brain! I didn't know about profilers, and they will help me for sure in the future, thanks a lot! – David P, Apr 16 at 16:42