Probability of event using Central Limit Theorem + plotting results

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
5
down vote

favorite
1












I've been doing a Udemy course called: "Statistics for Data Science" and I decided to solve one of the homework with Python to kill two birds with one rocket #elon.



The task was:




The team of traders under your supervision earns profits which can be
approximated with Laplace distribution. Profits (of any trade) have a
mean of $95.70 and a std. dev. of $1,247. Your team makes about 100
trades every week.



Questions:



A. What is the probability of my team making a loss in any
given week?
B. What is the probability of my team making over $20,000
in any given week?




As I just started to learn Python I would be happy for some hints and opinions.



# set up
import math
import scipy.stats as st
import matplotlib.pyplot as plt
import numpy as np

# Data
mu = 95.7 # mean
sigma = 1247 # standard deviation
n = 100 # sampling size (trades here)
xcritical1 = 0 # making a loss
xcritical2 = 20000 / n # Earning $20k a weak by 100 trades

mu_1 = mu # Based on Central Limit Theorem
sigma_1 = sigma / (math.sqrt(n)) # Based on CLT

# Calc
def Z(xcritical, mu, sigma):
return (xcritical - mu) / sigma # Standard Score (z-value)

Z1 = Z(xcritical1, mu_1, sigma_1)
Z2 = Z(xcritical2, mu_1, sigma_1)

P1 = st.norm.cdf(Z1) # Cumulative Distribution Function for ND
P2 = 1 - st.norm.cdf(Z2)

print('A. Probability of making loss in any given week is', '0:.4g'.format(P1*100) + '%')
print('B. Probability of making over $20k in any given week is', '0:.4g'.format(P2*100) + '%')

# Plots
def draw_z_score(x, cond, mu, sigma, title):
y = st.norm.pdf(x, mu, sigma) # Probability Density function for ND
z = x[cond]
plt.plot(x, y)
plt.fill_between(z, 0, st.norm.pdf(z, mu, sigma))
plt.title(title)
plt.text(-300, 0.0020, r'$mu=' + str(mu_1) + ', sigma=' + str(sigma_1) + '$')
plt.show()

x = np.arange(-400, 500, 1) # Fixed interval by experimenting
title1 = 'Probability of making loss: ' + '0:.4g'.format(P1*100) + '%'
title2 = 'Probability of earning more than $20k: ' + '0:.4g'.format(P2*100) + '%'

draw_z_score(x, x < xcritical1, mu_1, sigma_1, title1)
draw_z_score(x, x > xcritical2, mu_1, sigma_1, title2)






share|improve this question





















  • Why are you using a normal distribution, instead of a Laplace distribution as stated in the task description?
    – Graipher
    Feb 12 at 6:13










  • Because any distribution can be approximated by Normal Distribution by Central Limit Theorem and this is what they expected me in this task.
    – Mateusz Konopelski
    Feb 12 at 14:48










  • Fair enough (if you mean "the sum of random variables from any random distribution", instead of "any random distribution"). And in this case the approximation is actually already good enough with N=100: repl.it/@graipher/Sumlaplace-vs-Gauss
    – Graipher
    Feb 12 at 15:30
















up vote
5
down vote

favorite
1












I've been doing a Udemy course called: "Statistics for Data Science" and I decided to solve one of the homework with Python to kill two birds with one rocket #elon.



The task was:




The team of traders under your supervision earns profits which can be
approximated with Laplace distribution. Profits (of any trade) have a
mean of $95.70 and a std. dev. of $1,247. Your team makes about 100
trades every week.



Questions:



A. What is the probability of my team making a loss in any
given week?
B. What is the probability of my team making over $20,000
in any given week?




As I just started to learn Python I would be happy for some hints and opinions.



# set up
import math
import scipy.stats as st
import matplotlib.pyplot as plt
import numpy as np

# Data
mu = 95.7 # mean
sigma = 1247 # standard deviation
n = 100 # sampling size (trades here)
xcritical1 = 0 # making a loss
xcritical2 = 20000 / n # Earning $20k a weak by 100 trades

mu_1 = mu # Based on Central Limit Theorem
sigma_1 = sigma / (math.sqrt(n)) # Based on CLT

# Calc
def Z(xcritical, mu, sigma):
return (xcritical - mu) / sigma # Standard Score (z-value)

Z1 = Z(xcritical1, mu_1, sigma_1)
Z2 = Z(xcritical2, mu_1, sigma_1)

P1 = st.norm.cdf(Z1) # Cumulative Distribution Function for ND
P2 = 1 - st.norm.cdf(Z2)

print('A. Probability of making loss in any given week is', '0:.4g'.format(P1*100) + '%')
print('B. Probability of making over $20k in any given week is', '0:.4g'.format(P2*100) + '%')

# Plots
def draw_z_score(x, cond, mu, sigma, title):
y = st.norm.pdf(x, mu, sigma) # Probability Density function for ND
z = x[cond]
plt.plot(x, y)
plt.fill_between(z, 0, st.norm.pdf(z, mu, sigma))
plt.title(title)
plt.text(-300, 0.0020, r'$mu=' + str(mu_1) + ', sigma=' + str(sigma_1) + '$')
plt.show()

x = np.arange(-400, 500, 1) # Fixed interval by experimenting
title1 = 'Probability of making loss: ' + '0:.4g'.format(P1*100) + '%'
title2 = 'Probability of earning more than $20k: ' + '0:.4g'.format(P2*100) + '%'

draw_z_score(x, x < xcritical1, mu_1, sigma_1, title1)
draw_z_score(x, x > xcritical2, mu_1, sigma_1, title2)






share|improve this question





















  • Why are you using a normal distribution, instead of a Laplace distribution as stated in the task description?
    – Graipher
    Feb 12 at 6:13










  • Because any distribution can be approximated by Normal Distribution by Central Limit Theorem and this is what they expected me in this task.
    – Mateusz Konopelski
    Feb 12 at 14:48










  • Fair enough (if you mean "the sum of random variables from any random distribution", instead of "any random distribution"). And in this case the approximation is actually already good enough with N=100: repl.it/@graipher/Sumlaplace-vs-Gauss
    – Graipher
    Feb 12 at 15:30












up vote
5
down vote

favorite
1









up vote
5
down vote

favorite
1






1





I've been doing a Udemy course called: "Statistics for Data Science" and I decided to solve one of the homework with Python to kill two birds with one rocket #elon.



The task was:




The team of traders under your supervision earns profits which can be
approximated with Laplace distribution. Profits (of any trade) have a
mean of $95.70 and a std. dev. of $1,247. Your team makes about 100
trades every week.



Questions:



A. What is the probability of my team making a loss in any
given week?
B. What is the probability of my team making over $20,000
in any given week?




As I just started to learn Python I would be happy for some hints and opinions.



# set up
import math
import scipy.stats as st
import matplotlib.pyplot as plt
import numpy as np

# Data
mu = 95.7 # mean
sigma = 1247 # standard deviation
n = 100 # sampling size (trades here)
xcritical1 = 0 # making a loss
xcritical2 = 20000 / n # Earning $20k a weak by 100 trades

mu_1 = mu # Based on Central Limit Theorem
sigma_1 = sigma / (math.sqrt(n)) # Based on CLT

# Calc
def Z(xcritical, mu, sigma):
return (xcritical - mu) / sigma # Standard Score (z-value)

Z1 = Z(xcritical1, mu_1, sigma_1)
Z2 = Z(xcritical2, mu_1, sigma_1)

P1 = st.norm.cdf(Z1) # Cumulative Distribution Function for ND
P2 = 1 - st.norm.cdf(Z2)

print('A. Probability of making loss in any given week is', '0:.4g'.format(P1*100) + '%')
print('B. Probability of making over $20k in any given week is', '0:.4g'.format(P2*100) + '%')

# Plots
def draw_z_score(x, cond, mu, sigma, title):
y = st.norm.pdf(x, mu, sigma) # Probability Density function for ND
z = x[cond]
plt.plot(x, y)
plt.fill_between(z, 0, st.norm.pdf(z, mu, sigma))
plt.title(title)
plt.text(-300, 0.0020, r'$mu=' + str(mu_1) + ', sigma=' + str(sigma_1) + '$')
plt.show()

x = np.arange(-400, 500, 1) # Fixed interval by experimenting
title1 = 'Probability of making loss: ' + '0:.4g'.format(P1*100) + '%'
title2 = 'Probability of earning more than $20k: ' + '0:.4g'.format(P2*100) + '%'

draw_z_score(x, x < xcritical1, mu_1, sigma_1, title1)
draw_z_score(x, x > xcritical2, mu_1, sigma_1, title2)






share|improve this question













I've been doing a Udemy course called: "Statistics for Data Science" and I decided to solve one of the homework with Python to kill two birds with one rocket #elon.



The task was:




The team of traders under your supervision earns profits which can be
approximated with Laplace distribution. Profits (of any trade) have a
mean of $95.70 and a std. dev. of $1,247. Your team makes about 100
trades every week.



Questions:



A. What is the probability of my team making a loss in any
given week?
B. What is the probability of my team making over $20,000
in any given week?




As I just started to learn Python I would be happy for some hints and opinions.



# set up
import math
import scipy.stats as st
import matplotlib.pyplot as plt
import numpy as np

# Data
mu = 95.7 # mean
sigma = 1247 # standard deviation
n = 100 # sampling size (trades here)
xcritical1 = 0 # making a loss
xcritical2 = 20000 / n # Earning $20k a weak by 100 trades

mu_1 = mu # Based on Central Limit Theorem
sigma_1 = sigma / (math.sqrt(n)) # Based on CLT

# Calc
def Z(xcritical, mu, sigma):
return (xcritical - mu) / sigma # Standard Score (z-value)

Z1 = Z(xcritical1, mu_1, sigma_1)
Z2 = Z(xcritical2, mu_1, sigma_1)

P1 = st.norm.cdf(Z1) # Cumulative Distribution Function for ND
P2 = 1 - st.norm.cdf(Z2)

print('A. Probability of making loss in any given week is', '0:.4g'.format(P1*100) + '%')
print('B. Probability of making over $20k in any given week is', '0:.4g'.format(P2*100) + '%')

# Plots
def draw_z_score(x, cond, mu, sigma, title):
y = st.norm.pdf(x, mu, sigma) # Probability Density function for ND
z = x[cond]
plt.plot(x, y)
plt.fill_between(z, 0, st.norm.pdf(z, mu, sigma))
plt.title(title)
plt.text(-300, 0.0020, r'$mu=' + str(mu_1) + ', sigma=' + str(sigma_1) + '$')
plt.show()

x = np.arange(-400, 500, 1) # Fixed interval by experimenting
title1 = 'Probability of making loss: ' + '0:.4g'.format(P1*100) + '%'
title2 = 'Probability of earning more than $20k: ' + '0:.4g'.format(P2*100) + '%'

draw_z_score(x, x < xcritical1, mu_1, sigma_1, title1)
draw_z_score(x, x > xcritical2, mu_1, sigma_1, title2)








share|improve this question












share|improve this question




share|improve this question








edited Feb 12 at 15:33









Graipher

20.5k43081




20.5k43081









asked Feb 8 at 23:21









Mateusz Konopelski

654




654











  • Why are you using a normal distribution, instead of a Laplace distribution as stated in the task description?
    – Graipher
    Feb 12 at 6:13










  • Because any distribution can be approximated by Normal Distribution by Central Limit Theorem and this is what they expected me in this task.
    – Mateusz Konopelski
    Feb 12 at 14:48










  • Fair enough (if you mean "the sum of random variables from any random distribution", instead of "any random distribution"). And in this case the approximation is actually already good enough with N=100: repl.it/@graipher/Sumlaplace-vs-Gauss
    – Graipher
    Feb 12 at 15:30
















  • Why are you using a normal distribution, instead of a Laplace distribution as stated in the task description?
    – Graipher
    Feb 12 at 6:13










  • Because any distribution can be approximated by Normal Distribution by Central Limit Theorem and this is what they expected me in this task.
    – Mateusz Konopelski
    Feb 12 at 14:48










  • Fair enough (if you mean "the sum of random variables from any random distribution", instead of "any random distribution"). And in this case the approximation is actually already good enough with N=100: repl.it/@graipher/Sumlaplace-vs-Gauss
    – Graipher
    Feb 12 at 15:30















Why are you using a normal distribution, instead of a Laplace distribution as stated in the task description?
– Graipher
Feb 12 at 6:13




Why are you using a normal distribution, instead of a Laplace distribution as stated in the task description?
– Graipher
Feb 12 at 6:13












Because any distribution can be approximated by Normal Distribution by Central Limit Theorem and this is what they expected me in this task.
– Mateusz Konopelski
Feb 12 at 14:48




Because any distribution can be approximated by Normal Distribution by Central Limit Theorem and this is what they expected me in this task.
– Mateusz Konopelski
Feb 12 at 14:48












Fair enough (if you mean "the sum of random variables from any random distribution", instead of "any random distribution"). And in this case the approximation is actually already good enough with N=100: repl.it/@graipher/Sumlaplace-vs-Gauss
– Graipher
Feb 12 at 15:30




Fair enough (if you mean "the sum of random variables from any random distribution", instead of "any random distribution"). And in this case the approximation is actually already good enough with N=100: repl.it/@graipher/Sumlaplace-vs-Gauss
– Graipher
Feb 12 at 15:30










1 Answer
1






active

oldest

votes

















up vote
4
down vote



accepted










You seem to know about str.format (since you use it), yet you are still doing string addition:



title1 = 'Probability of making loss: ' + '0:.4g'.format(P1*100) + '%'
title2 = 'Probability of earning more than $20k: ' + '0:.4g'.format(P2*100) + '%'


Just make it a single string:



title1 = 'Probability of making loss: 0:.4g%'.format(P1*100)
title2 = 'Probability of earning more than $20k: 0:.4g%'.format(P2*100)


You could also use the % formatting option for floats (described in the fifth table here), which automatically multiplies with 100 and adds the % sign. The number of digits after the decimal point work differently, though, compared to the g format:



":.4%".format(0.5)
# '50.0000%'
":.4g%".format(0.5*100.)
# '50%'
":.4g%".format(0.5123123*100.)
# '51.23%'
":.4%".format(0.5123123)
# '51.2312%'



You should also re-organize your code according to the following scheme:



import something

GLOBAL_CONSTANT = "value"

class Definitions

def functions()

if __name__ == '__main__':
main()
# or some small code using the stuff defined above


That last part is there to protect your code from being executed if you want to import some part of this script from another script. At the moment if you want to do from laplace import Z, your whole code would run. With the if __name__ == '__main__' guard, it will only run when you execute this script.




You should add docstrings to your functions, describing what they do, what arguments they take and what they return. Have a look at PEP257 for the guidelines regarding docstrings.




And, in order to answer my own question from the comments:



Yes, it is justified to use the normal distribution here, since the Central Limit Theorem guarantees that the sum of random variables tends towards a normal distribution. The approximation is quite good for $N = 100$, as can be seen with the following small script:



import numpy as np
import matplotlib.pyplot as plt

mu = 95.7
sigma = 1247.
n = 100 # how many random variables to sum for each value
N = 1000 # how many values to generate

x_l = np.random.laplace(mu, sigma/np.sqrt(2), (n, N)).sum(axis=0)
x_g = np.random.normal(n*mu, np.sqrt(n)*sigma, N)

plt.hist(x_g, label="gaus")
plt.hist(x_l, label="sum(laplace)", histtype='step')
plt.legend()
plt.show()


enter image description here






share|improve this answer























    Your Answer




    StackExchange.ifUsing("editor", function ()
    return StackExchange.using("mathjaxEditing", function ()
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
    );
    );
    , "mathjax-editing");

    StackExchange.ifUsing("editor", function ()
    StackExchange.using("externalEditor", function ()
    StackExchange.using("snippets", function ()
    StackExchange.snippets.init();
    );
    );
    , "code-snippets");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "196"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    convertImagesToLinks: false,
    noModals: false,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );








     

    draft saved


    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f187132%2fprobability-of-event-using-central-limit-theorem-plotting-results%23new-answer', 'question_page');

    );

    Post as a guest






























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    4
    down vote



    accepted










    You seem to know about str.format (since you use it), yet you are still doing string addition:



    title1 = 'Probability of making loss: ' + '0:.4g'.format(P1*100) + '%'
    title2 = 'Probability of earning more than $20k: ' + '0:.4g'.format(P2*100) + '%'


    Just make it a single string:



    title1 = 'Probability of making loss: 0:.4g%'.format(P1*100)
    title2 = 'Probability of earning more than $20k: 0:.4g%'.format(P2*100)


    You could also use the % formatting option for floats (described in the fifth table here), which automatically multiplies with 100 and adds the % sign. The number of digits after the decimal point work differently, though, compared to the g format:



    ":.4%".format(0.5)
    # '50.0000%'
    ":.4g%".format(0.5*100.)
    # '50%'
    ":.4g%".format(0.5123123*100.)
    # '51.23%'
    ":.4%".format(0.5123123)
    # '51.2312%'



    You should also re-organize your code according to the following scheme:



    import something

    GLOBAL_CONSTANT = "value"

    class Definitions

    def functions()

    if __name__ == '__main__':
    main()
    # or some small code using the stuff defined above


    That last part is there to protect your code from being executed if you want to import some part of this script from another script. At the moment if you want to do from laplace import Z, your whole code would run. With the if __name__ == '__main__' guard, it will only run when you execute this script.




    You should add docstrings to your functions, describing what they do, what arguments they take and what they return. Have a look at PEP257 for the guidelines regarding docstrings.




    And, in order to answer my own question from the comments:



    Yes, it is justified to use the normal distribution here, since the Central Limit Theorem guarantees that the sum of random variables tends towards a normal distribution. The approximation is quite good for $N = 100$, as can be seen with the following small script:



    import numpy as np
    import matplotlib.pyplot as plt

    mu = 95.7
    sigma = 1247.
    n = 100 # how many random variables to sum for each value
    N = 1000 # how many values to generate

    x_l = np.random.laplace(mu, sigma/np.sqrt(2), (n, N)).sum(axis=0)
    x_g = np.random.normal(n*mu, np.sqrt(n)*sigma, N)

    plt.hist(x_g, label="gaus")
    plt.hist(x_l, label="sum(laplace)", histtype='step')
    plt.legend()
    plt.show()


    enter image description here






    share|improve this answer



























      up vote
      4
      down vote



      accepted










      You seem to know about str.format (since you use it), yet you are still doing string addition:



      title1 = 'Probability of making loss: ' + '0:.4g'.format(P1*100) + '%'
      title2 = 'Probability of earning more than $20k: ' + '0:.4g'.format(P2*100) + '%'


      Just make it a single string:



      title1 = 'Probability of making loss: 0:.4g%'.format(P1*100)
      title2 = 'Probability of earning more than $20k: 0:.4g%'.format(P2*100)


      You could also use the % formatting option for floats (described in the fifth table here), which automatically multiplies with 100 and adds the % sign. The number of digits after the decimal point work differently, though, compared to the g format:



      ":.4%".format(0.5)
      # '50.0000%'
      ":.4g%".format(0.5*100.)
      # '50%'
      ":.4g%".format(0.5123123*100.)
      # '51.23%'
      ":.4%".format(0.5123123)
      # '51.2312%'



      You should also re-organize your code according to the following scheme:



      import something

      GLOBAL_CONSTANT = "value"

      class Definitions

      def functions()

      if __name__ == '__main__':
      main()
      # or some small code using the stuff defined above


      That last part is there to protect your code from being executed if you want to import some part of this script from another script. At the moment if you want to do from laplace import Z, your whole code would run. With the if __name__ == '__main__' guard, it will only run when you execute this script.




      You should add docstrings to your functions, describing what they do, what arguments they take and what they return. Have a look at PEP257 for the guidelines regarding docstrings.




      And, in order to answer my own question from the comments:



      Yes, it is justified to use the normal distribution here, since the Central Limit Theorem guarantees that the sum of random variables tends towards a normal distribution. The approximation is quite good for $N = 100$, as can be seen with the following small script:



      import numpy as np
      import matplotlib.pyplot as plt

      mu = 95.7
      sigma = 1247.
      n = 100 # how many random variables to sum for each value
      N = 1000 # how many values to generate

      x_l = np.random.laplace(mu, sigma/np.sqrt(2), (n, N)).sum(axis=0)
      x_g = np.random.normal(n*mu, np.sqrt(n)*sigma, N)

      plt.hist(x_g, label="gaus")
      plt.hist(x_l, label="sum(laplace)", histtype='step')
      plt.legend()
      plt.show()


      enter image description here






      share|improve this answer

























        up vote
        4
        down vote



        accepted







        up vote
        4
        down vote



        accepted






        You seem to know about str.format (since you use it), yet you are still doing string addition:



        title1 = 'Probability of making loss: ' + '0:.4g'.format(P1*100) + '%'
        title2 = 'Probability of earning more than $20k: ' + '0:.4g'.format(P2*100) + '%'


        Just make it a single string:



        title1 = 'Probability of making loss: 0:.4g%'.format(P1*100)
        title2 = 'Probability of earning more than $20k: 0:.4g%'.format(P2*100)


        You could also use the % formatting option for floats (described in the fifth table here), which automatically multiplies with 100 and adds the % sign. The number of digits after the decimal point work differently, though, compared to the g format:



        ":.4%".format(0.5)
        # '50.0000%'
        ":.4g%".format(0.5*100.)
        # '50%'
        ":.4g%".format(0.5123123*100.)
        # '51.23%'
        ":.4%".format(0.5123123)
        # '51.2312%'



        You should also re-organize your code according to the following scheme:



        import something

        GLOBAL_CONSTANT = "value"

        class Definitions

        def functions()

        if __name__ == '__main__':
        main()
        # or some small code using the stuff defined above


        That last part is there to protect your code from being executed if you want to import some part of this script from another script. At the moment if you want to do from laplace import Z, your whole code would run. With the if __name__ == '__main__' guard, it will only run when you execute this script.




        You should add docstrings to your functions, describing what they do, what arguments they take and what they return. Have a look at PEP257 for the guidelines regarding docstrings.




        And, in order to answer my own question from the comments:



        Yes, it is justified to use the normal distribution here, since the Central Limit Theorem guarantees that the sum of random variables tends towards a normal distribution. The approximation is quite good for $N = 100$, as can be seen with the following small script:



        import numpy as np
        import matplotlib.pyplot as plt

        mu = 95.7
        sigma = 1247.
        n = 100 # how many random variables to sum for each value
        N = 1000 # how many values to generate

        x_l = np.random.laplace(mu, sigma/np.sqrt(2), (n, N)).sum(axis=0)
        x_g = np.random.normal(n*mu, np.sqrt(n)*sigma, N)

        plt.hist(x_g, label="gaus")
        plt.hist(x_l, label="sum(laplace)", histtype='step')
        plt.legend()
        plt.show()


        enter image description here






        share|improve this answer















        You seem to know about str.format (since you use it), yet you are still doing string addition:



        title1 = 'Probability of making loss: ' + '0:.4g'.format(P1*100) + '%'
        title2 = 'Probability of earning more than $20k: ' + '0:.4g'.format(P2*100) + '%'


        Just make it a single string:



        title1 = 'Probability of making loss: 0:.4g%'.format(P1*100)
        title2 = 'Probability of earning more than $20k: 0:.4g%'.format(P2*100)


        You could also use the % formatting option for floats (described in the fifth table here), which automatically multiplies with 100 and adds the % sign. The number of digits after the decimal point work differently, though, compared to the g format:



        ":.4%".format(0.5)
        # '50.0000%'
        ":.4g%".format(0.5*100.)
        # '50%'
        ":.4g%".format(0.5123123*100.)
        # '51.23%'
        ":.4%".format(0.5123123)
        # '51.2312%'



        You should also re-organize your code according to the following scheme:



        import something

        GLOBAL_CONSTANT = "value"

        class Definitions

        def functions()

        if __name__ == '__main__':
        main()
        # or some small code using the stuff defined above


        That last part is there to protect your code from being executed if you want to import some part of this script from another script. At the moment if you want to do from laplace import Z, your whole code would run. With the if __name__ == '__main__' guard, it will only run when you execute this script.




        You should add docstrings to your functions, describing what they do, what arguments they take and what they return. Have a look at PEP257 for the guidelines regarding docstrings.




        And, in order to answer my own question from the comments:



        Yes, it is justified to use the normal distribution here, since the Central Limit Theorem guarantees that the sum of random variables tends towards a normal distribution. The approximation is quite good for $N = 100$, as can be seen with the following small script:



        import numpy as np
        import matplotlib.pyplot as plt

        mu = 95.7
        sigma = 1247.
        n = 100 # how many random variables to sum for each value
        N = 1000 # how many values to generate

        x_l = np.random.laplace(mu, sigma/np.sqrt(2), (n, N)).sum(axis=0)
        x_g = np.random.normal(n*mu, np.sqrt(n)*sigma, N)

        plt.hist(x_g, label="gaus")
        plt.hist(x_l, label="sum(laplace)", histtype='step')
        plt.legend()
        plt.show()


        enter image description here







        share|improve this answer















        share|improve this answer



        share|improve this answer








        edited Feb 13 at 13:14


























        answered Feb 12 at 15:40









        Graipher

        20.5k43081




        20.5k43081






















             

            draft saved


            draft discarded


























             


            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f187132%2fprobability-of-event-using-central-limit-theorem-plotting-results%23new-answer', 'question_page');

            );

            Post as a guest













































































            Popular posts from this blog

            Greedy Best First Search implementation in Rust

            Function to Return a JSON Like Objects Using VBA Collections and Arrays

            C++11 CLH Lock Implementation