Prime Number Generator & Efficiency

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
8
down vote

favorite
1












If anyone can explain to me the advantages of using things like the Sieve of Atkin over the Sieve of Eratosthenes, and how to code it (especially the Sieve of Atkin), that would be highly appreciated.



As well, I welcome anyone has any ideas, suggestions, or improvements that can improve the efficiency of the code I already have.



What I would like to do: I would like to make (or improve my function) a function that returns an array of integers containing all the prime numbers from 2...n that can execute as quickly and efficiently as possible (ideally so that it will be able to execute n = 1,000,000,000 in a matter of seconds).



Note: I am using Swift 4.1 and Xcode 9.3




Where I'm Currently at



So far, I've used the Sieve of Eratosthenes to make this function that can calculate prime numbers under 1,000,000 in 0.28s, but when I try it with a number like 1 billion or 1 trillion, it takes too long.



func generatePrimes(to n: Int) -> [Int] 

precondition(n > 5, "Input must be greater than 5")

var arr : [Int] = Array<Int>(stride(from: 3, to: n, by: 2))

for index in 0...
let num : Int = arr.remove(at: index)
arr = arr.filter $0 % num != 0
arr.insert(num, at: index)
if arr[index + 1] >= Int(floor(sqrtf(Float(n)))) break


arr.insert(2, at: 0)

//Print Statements
print("Prime numbers under (n):")
_ = arr.enumerated().map (index, element) in print("t(index + 1). (element)")

return arr



Usage:



generatePrimes(to: 50)


Prints:





Prime numbers under 50:
1. 2
2. 3
3. 5
4. 7
5. 11
6. 13
7. 17
8. 19
9. 23
10. 29
11. 31
12. 37
13. 41
14. 43
15. 47






share|improve this question



























    up vote
    8
    down vote

    favorite
    1












    If anyone can explain to me the advantages of using things like the Sieve of Atkin over the Sieve of Eratosthenes, and how to code it (especially the Sieve of Atkin), that would be highly appreciated.



    As well, I welcome anyone has any ideas, suggestions, or improvements that can improve the efficiency of the code I already have.



    What I would like to do: I would like to make (or improve my function) a function that returns an array of integers containing all the prime numbers from 2...n that can execute as quickly and efficiently as possible (ideally so that it will be able to execute n = 1,000,000,000 in a matter of seconds).



    Note: I am using Swift 4.1 and Xcode 9.3




    Where I'm Currently at



    So far, I've used the Sieve of Eratosthenes to make this function that can calculate prime numbers under 1,000,000 in 0.28s, but when I try it with a number like 1 billion or 1 trillion, it takes too long.



    func generatePrimes(to n: Int) -> [Int] 

    precondition(n > 5, "Input must be greater than 5")

    var arr : [Int] = Array<Int>(stride(from: 3, to: n, by: 2))

    for index in 0...
    let num : Int = arr.remove(at: index)
    arr = arr.filter $0 % num != 0
    arr.insert(num, at: index)
    if arr[index + 1] >= Int(floor(sqrtf(Float(n)))) break


    arr.insert(2, at: 0)

    //Print Statements
    print("Prime numbers under (n):")
    _ = arr.enumerated().map (index, element) in print("t(index + 1). (element)")

    return arr



    Usage:



    generatePrimes(to: 50)


    Prints:





    Prime numbers under 50:
    1. 2
    2. 3
    3. 5
    4. 7
    5. 11
    6. 13
    7. 17
    8. 19
    9. 23
    10. 29
    11. 31
    12. 37
    13. 41
    14. 43
    15. 47






    share|improve this question























      up vote
      8
      down vote

      favorite
      1









      up vote
      8
      down vote

      favorite
      1






      1





      If anyone can explain to me the advantages of using things like the Sieve of Atkin over the Sieve of Eratosthenes, and how to code it (especially the Sieve of Atkin), that would be highly appreciated.



      As well, I welcome anyone has any ideas, suggestions, or improvements that can improve the efficiency of the code I already have.



      What I would like to do: I would like to make (or improve my function) a function that returns an array of integers containing all the prime numbers from 2...n that can execute as quickly and efficiently as possible (ideally so that it will be able to execute n = 1,000,000,000 in a matter of seconds).



      Note: I am using Swift 4.1 and Xcode 9.3




      Where I'm Currently at



      So far, I've used the Sieve of Eratosthenes to make this function that can calculate prime numbers under 1,000,000 in 0.28s, but when I try it with a number like 1 billion or 1 trillion, it takes too long.



      func generatePrimes(to n: Int) -> [Int] 

      precondition(n > 5, "Input must be greater than 5")

      var arr : [Int] = Array<Int>(stride(from: 3, to: n, by: 2))

      for index in 0...
      let num : Int = arr.remove(at: index)
      arr = arr.filter $0 % num != 0
      arr.insert(num, at: index)
      if arr[index + 1] >= Int(floor(sqrtf(Float(n)))) break


      arr.insert(2, at: 0)

      //Print Statements
      print("Prime numbers under (n):")
      _ = arr.enumerated().map (index, element) in print("t(index + 1). (element)")

      return arr



      Usage:



      generatePrimes(to: 50)


      Prints:





      Prime numbers under 50:
      1. 2
      2. 3
      3. 5
      4. 7
      5. 11
      6. 13
      7. 17
      8. 19
      9. 23
      10. 29
      11. 31
      12. 37
      13. 41
      14. 43
      15. 47






      share|improve this question













      If anyone can explain to me the advantages of using things like the Sieve of Atkin over the Sieve of Eratosthenes, and how to code it (especially the Sieve of Atkin), that would be highly appreciated.



      As well, I welcome anyone has any ideas, suggestions, or improvements that can improve the efficiency of the code I already have.



      What I would like to do: I would like to make (or improve my function) a function that returns an array of integers containing all the prime numbers from 2...n that can execute as quickly and efficiently as possible (ideally so that it will be able to execute n = 1,000,000,000 in a matter of seconds).



      Note: I am using Swift 4.1 and Xcode 9.3




      Where I'm Currently at



      So far, I've used the Sieve of Eratosthenes to make this function that can calculate prime numbers under 1,000,000 in 0.28s, but when I try it with a number like 1 billion or 1 trillion, it takes too long.



      func generatePrimes(to n: Int) -> [Int] 

      precondition(n > 5, "Input must be greater than 5")

      var arr : [Int] = Array<Int>(stride(from: 3, to: n, by: 2))

      for index in 0...
      let num : Int = arr.remove(at: index)
      arr = arr.filter $0 % num != 0
      arr.insert(num, at: index)
      if arr[index + 1] >= Int(floor(sqrtf(Float(n)))) break


      arr.insert(2, at: 0)

      //Print Statements
      print("Prime numbers under (n):")
      _ = arr.enumerated().map (index, element) in print("t(index + 1). (element)")

      return arr



      Usage:



      generatePrimes(to: 50)


      Prints:





      Prime numbers under 50:
      1. 2
      2. 3
      3. 5
      4. 7
      5. 11
      6. 13
      7. 17
      8. 19
      9. 23
      10. 29
      11. 31
      12. 37
      13. 41
      14. 43
      15. 47








      share|improve this question












      share|improve this question




      share|improve this question








      edited Jun 10 at 4:09









      Jamal♦

      30.1k11114225




      30.1k11114225









      asked Apr 14 at 0:09









      Noah Wilder

      745




      745




















          1 Answer
          1






          active

          oldest

          votes

















          up vote
          7
          down vote



          accepted










          That is not the Sieve of Eratosthenes



          The Sieve of Eratosthenes computes multiples
          of each found prime to mark subsequent composite numbers in the sieve.
          Your algorithm computes the remainder of all subsequent numbers instead.
          That makes a huge difference. I'll come back to that later, let's start
          with a



          Review of your current code



          There is a inconsistency here:



          func generatePrimes(to n: Int) -> [Int] 
          // ...
          var arr : [Int] = Array<Int>(stride(from: 3, to: n, by: 2))
          // ...



          The function – as I understand the to parameter –
          computes all primes up to and including n. On the other hand,
          stride(from: 3, to: n, by: 2) does not include the upper bound,
          and that is easily verified with



          print(generatePrimes(to: 11)) // [2, 3, 5, 7]


          So either rename the function to func generatePrimes(below n: Int)
          or use stride(from: 3, through: n, by: 2) to include the upper bound.
          I'll do the latter for this review.



          It would also be a good idea to add a documentation comment which
          unambiguously documents the function parameters.



          The explicit type annotation is not needed, and the array creation
          can be simplified to



          var arr = Array(stride(from: 3, through: n, by: 2))


          Why is the limit number expected to be greater than 5? That is an
          unnecessary restriction and would be unexpected for a caller of your function. It may be that your implementation does not work if
          $ n le 5 $, but it would be easy to handle that case separately:



          if n <= 5 
          return [2, 3, 5].filter $0 <= n



          In



           let num : Int = arr.remove(at: index)


          the type annotation on the left is again not needed.



          (Remark: This



           arr = arr.filter $0 % num != 0 


          can be replaced by the more efficient



           arr.removeAll(where: $0 % num == 0 )


          in a future version of Swift, when SE-0197 Adding in-place removeAll(where:) to the Standard Library is implemented in Swift 4.2.)



          Here



           if arr[index + 1] >= Int(floor(sqrtf(Float(n)))) break 


          is a hidden bug: As



          print(generatePrimes(to: 26)) // [2, 3, 5, 7, 11, 13, 17, 19, 23, 25]


          demonstrates, you are leaving the loop to early.
          I would also compute the square root only once, and use double-precision
          arithmetic: The 24 bit significand of a Float cannot represent large
          integers correctly (compare Computing the integer square root of large numbers).



          Finally, the function should compute the result, but not print it, i.e. this



          //Print Statements
          print("Prime numbers under (n):")
          _ = arr.enumerated().map (index, element) in print("t(index + 1). (element)")


          should be removed. It is generally a good habit to separate computation from I/O.
          In addition, this distorts the benchmark results, because you measure also the
          time for converting the numbers to strings, and the time to print these strings
          (which depends on the output device, printing to a file is faster than printing
          to the Terminal or the Xcode console).



          With all changes suggested so far, your function would look like this:



          /// Compute list of primes
          ///
          /// - Parameter n: The upper bound
          /// - Returns: An array with all primes <= `n`
          func generatePrimes(to n: Int) -> [Int]

          if n <= 5
          return [2, 3, 5].filter $0 <= n


          var arr = Array(stride(from: 3, through: n, by: 2))

          let squareRootN = Int(Double(n).squareRoot())
          for index in 0...
          if arr[index] > squareRootN break
          let num = arr.remove(at: index)
          arr = arr.filter $0 % num != 0
          arr.insert(num, at: index)


          arr.insert(2, at: 0)
          return arr



          Using the Sieve of Eratosthenes



          As I said above, your algorithm is different from the Eratosthenes sieve.
          Each time a prime number $ p $ is found, your code does trial divisions for
          all remaining numbers to remove multiples of $ p $.
          The Eratosthenes sieve computes multiples of $ p $ instead:



          p*p, p*p + p, p*p + 2*p, p*p + 3*p, ...


          and marks these as composite. Note that there are far less values to compute.



          Also your algorithm removes and inserts array elements frequently.
          That is slow because the remaining elements must be shifted to the left
          or to the right.



          The Eratosthenes sieve works with a fixed-sized boolean array instead.



          Here is a very simple straight-forward implementation in Swift:



          func eratosthenesSieve(to n: Int) -> [Int] 
          var composite = Array(repeating: false, count: n + 1) // The sieve
          var primes: [Int] =

          if n >= 55
          // Upper bound for the number of primes up to and including `n`,
          // from https://en.wikipedia.org/wiki/Prime_number_theorem#Non-asymptotic_bounds_on_the_prime-counting_function :
          let d = Double(n)
          let upperBound = Int(d / (log(d) - 4))
          primes.reserveCapacity(upperBound)


          let squareRootN = Int(Double(n).squareRoot())
          var p = 2
          while p <= squareRootN
          if !composite[p]
          primes.append(p)
          for q in stride(from: p * p, through: n, by: p)
          composite[q] = true


          p += 1

          while p <= n
          if !composite[p]
          primes.append(p)

          p += 1

          return primes



          The only optimization I did was to reserve capacity for the result
          array in order to avoid reallocations.



          This is already considerably faster than your original code.
          Here is my benchmarking code:



          func benchmark(_ ident: String, _ function: (Int) -> [Int], to n: Int) 
          let start = Date()
          let primes = function(n)
          let elapsed = Date().timeIntervalSince(start)
          print(ident, primes.count, "primes up to", n, String(format: "time: %.3f", elapsed))


          benchmark("original", generatePrimes, to: 100_000)
          benchmark("original", generatePrimes, to: 1_000_000)
          benchmark("original", generatePrimes, to: 10_000_000)

          print()

          benchmark("eratosthenes", eratosthenesSieve, to: 100_000)
          benchmark("eratosthenes", eratosthenesSieve, to: 1_000_000)
          benchmark("eratosthenes", eratosthenesSieve, to: 10_000_000)
          benchmark("eratosthenes", eratosthenesSieve, to: 100_000_000)
          benchmark("eratosthenes", eratosthenesSieve, to: 1_000_000_000)


          On a 1.2 GHz Intel Core m5 MacBook (Retina, 12-inch, Early 2016)
          with 8GM RAM I get the following results (slightly reformatted to
          increase the legibility):




          original 9592 primes up to 100000 time: 0.038
          original 78498 primes up to 1000000 time: 0.418
          original 664579 primes up to 10000000 time: 9.739

          eratosthenes 9592 primes up to 100000 time: 0.001
          eratosthenes 78498 primes up to 1000000 time: 0.007
          eratosthenes 664579 primes up to 10000000 time: 0.096
          eratosthenes 5761455 primes up to 100000000 time: 1.223
          eratosthenes 50847534 primes up to 1000000000 time: 15.034


          This can surely be further improved, a first step would be to handle the case p=2
          separately and to use the sieve only for odd numbers (which however complicates the
          index calculations a bit).



          But, as you see, computing the primes up to 1 billion is feasible with the sieve
          of Eratosthenes (and the numbers are correct, you can compare them with
          https://primes.utm.edu/howmany.html).



          Going for the trillion



          You were also thinking of computing all primes up to one trillion. That might be
          a problem, no matter which algorithm you choose. According to
          https://primes.utm.edu/howmany.html, there are 37,607,912,018 prime numbers
          below 1 trillion. Even if we use 32-bit bit integers to save space, that would
          still require approx 140 GB memory, which makes returning an array with all that
          numbers difficult.



          So that would require a different approach which allows to compute prime numbers in
          some given range, instead of all prime numbers up to a limit.






          share|improve this answer























            Your Answer




            StackExchange.ifUsing("editor", function ()
            return StackExchange.using("mathjaxEditing", function ()
            StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
            StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
            );
            );
            , "mathjax-editing");

            StackExchange.ifUsing("editor", function ()
            StackExchange.using("externalEditor", function ()
            StackExchange.using("snippets", function ()
            StackExchange.snippets.init();
            );
            );
            , "code-snippets");

            StackExchange.ready(function()
            var channelOptions =
            tags: "".split(" "),
            id: "196"
            ;
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function()
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled)
            StackExchange.using("snippets", function()
            createEditor();
            );

            else
            createEditor();

            );

            function createEditor()
            StackExchange.prepareEditor(
            heartbeatType: 'answer',
            convertImagesToLinks: false,
            noModals: false,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            );



            );








             

            draft saved


            draft discarded


















            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f192021%2fprime-number-generator-efficiency%23new-answer', 'question_page');

            );

            Post as a guest






























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            7
            down vote



            accepted










            That is not the Sieve of Eratosthenes



            The Sieve of Eratosthenes computes multiples
            of each found prime to mark subsequent composite numbers in the sieve.
            Your algorithm computes the remainder of all subsequent numbers instead.
            That makes a huge difference. I'll come back to that later, let's start
            with a



            Review of your current code



            There is a inconsistency here:



            func generatePrimes(to n: Int) -> [Int] 
            // ...
            var arr : [Int] = Array<Int>(stride(from: 3, to: n, by: 2))
            // ...



            The function – as I understand the to parameter –
            computes all primes up to and including n. On the other hand,
            stride(from: 3, to: n, by: 2) does not include the upper bound,
            and that is easily verified with



            print(generatePrimes(to: 11)) // [2, 3, 5, 7]


            So either rename the function to func generatePrimes(below n: Int)
            or use stride(from: 3, through: n, by: 2) to include the upper bound.
            I'll do the latter for this review.



            It would also be a good idea to add a documentation comment which
            unambiguously documents the function parameters.



            The explicit type annotation is not needed, and the array creation
            can be simplified to



            var arr = Array(stride(from: 3, through: n, by: 2))


            Why is the limit number expected to be greater than 5? That is an
            unnecessary restriction and would be unexpected for a caller of your function. It may be that your implementation does not work if
            $ n le 5 $, but it would be easy to handle that case separately:



            if n <= 5 
            return [2, 3, 5].filter $0 <= n



            In



             let num : Int = arr.remove(at: index)


            the type annotation on the left is again not needed.



            (Remark: This



             arr = arr.filter $0 % num != 0 


            can be replaced by the more efficient



             arr.removeAll(where: $0 % num == 0 )


            in a future version of Swift, when SE-0197 Adding in-place removeAll(where:) to the Standard Library is implemented in Swift 4.2.)



            Here



             if arr[index + 1] >= Int(floor(sqrtf(Float(n)))) break 


            is a hidden bug: As



            print(generatePrimes(to: 26)) // [2, 3, 5, 7, 11, 13, 17, 19, 23, 25]


            demonstrates, you are leaving the loop to early.
            I would also compute the square root only once, and use double-precision
            arithmetic: The 24 bit significand of a Float cannot represent large
            integers correctly (compare Computing the integer square root of large numbers).



            Finally, the function should compute the result, but not print it, i.e. this



            //Print Statements
            print("Prime numbers under (n):")
            _ = arr.enumerated().map (index, element) in print("t(index + 1). (element)")


            should be removed. It is generally a good habit to separate computation from I/O.
            In addition, this distorts the benchmark results, because you measure also the
            time for converting the numbers to strings, and the time to print these strings
            (which depends on the output device, printing to a file is faster than printing
            to the Terminal or the Xcode console).



            With all changes suggested so far, your function would look like this:



            /// Compute list of primes
            ///
            /// - Parameter n: The upper bound
            /// - Returns: An array with all primes <= `n`
            func generatePrimes(to n: Int) -> [Int]

            if n <= 5
            return [2, 3, 5].filter $0 <= n


            var arr = Array(stride(from: 3, through: n, by: 2))

            let squareRootN = Int(Double(n).squareRoot())
            for index in 0...
            if arr[index] > squareRootN break
            let num = arr.remove(at: index)
            arr = arr.filter $0 % num != 0
            arr.insert(num, at: index)


            arr.insert(2, at: 0)
            return arr



            Using the Sieve of Eratosthenes



            As I said above, your algorithm is different from the Eratosthenes sieve.
            Each time a prime number $ p $ is found, your code does trial divisions for
            all remaining numbers to remove multiples of $ p $.
            The Eratosthenes sieve computes multiples of $ p $ instead:



            p*p, p*p + p, p*p + 2*p, p*p + 3*p, ...


            and marks these as composite. Note that there are far less values to compute.



            Also your algorithm removes and inserts array elements frequently.
            That is slow because the remaining elements must be shifted to the left
            or to the right.



            The Eratosthenes sieve works with a fixed-sized boolean array instead.



            Here is a very simple straight-forward implementation in Swift:



            func eratosthenesSieve(to n: Int) -> [Int] 
            var composite = Array(repeating: false, count: n + 1) // The sieve
            var primes: [Int] =

            if n >= 55
            // Upper bound for the number of primes up to and including `n`,
            // from https://en.wikipedia.org/wiki/Prime_number_theorem#Non-asymptotic_bounds_on_the_prime-counting_function :
            let d = Double(n)
            let upperBound = Int(d / (log(d) - 4))
            primes.reserveCapacity(upperBound)


            let squareRootN = Int(Double(n).squareRoot())
            var p = 2
            while p <= squareRootN
            if !composite[p]
            primes.append(p)
            for q in stride(from: p * p, through: n, by: p)
            composite[q] = true


            p += 1

            while p <= n
            if !composite[p]
            primes.append(p)

            p += 1

            return primes



            The only optimization I did was to reserve capacity for the result
            array in order to avoid reallocations.



            This is already considerably faster than your original code.
            Here is my benchmarking code:



            func benchmark(_ ident: String, _ function: (Int) -> [Int], to n: Int) 
            let start = Date()
            let primes = function(n)
            let elapsed = Date().timeIntervalSince(start)
            print(ident, primes.count, "primes up to", n, String(format: "time: %.3f", elapsed))


            benchmark("original", generatePrimes, to: 100_000)
            benchmark("original", generatePrimes, to: 1_000_000)
            benchmark("original", generatePrimes, to: 10_000_000)

            print()

            benchmark("eratosthenes", eratosthenesSieve, to: 100_000)
            benchmark("eratosthenes", eratosthenesSieve, to: 1_000_000)
            benchmark("eratosthenes", eratosthenesSieve, to: 10_000_000)
            benchmark("eratosthenes", eratosthenesSieve, to: 100_000_000)
            benchmark("eratosthenes", eratosthenesSieve, to: 1_000_000_000)


            On a 1.2 GHz Intel Core m5 MacBook (Retina, 12-inch, Early 2016)
            with 8GM RAM I get the following results (slightly reformatted to
            increase the legibility):




            original 9592 primes up to 100000 time: 0.038
            original 78498 primes up to 1000000 time: 0.418
            original 664579 primes up to 10000000 time: 9.739

            eratosthenes 9592 primes up to 100000 time: 0.001
            eratosthenes 78498 primes up to 1000000 time: 0.007
            eratosthenes 664579 primes up to 10000000 time: 0.096
            eratosthenes 5761455 primes up to 100000000 time: 1.223
            eratosthenes 50847534 primes up to 1000000000 time: 15.034


            This can surely be further improved, a first step would be to handle the case p=2
            separately and to use the sieve only for odd numbers (which however complicates the
            index calculations a bit).



            But, as you see, computing the primes up to 1 billion is feasible with the sieve
            of Eratosthenes (and the numbers are correct, you can compare them with
            https://primes.utm.edu/howmany.html).



            Going for the trillion



            You were also thinking of computing all primes up to one trillion. That might be
            a problem, no matter which algorithm you choose. According to
            https://primes.utm.edu/howmany.html, there are 37,607,912,018 prime numbers
            below 1 trillion. Even if we use 32-bit bit integers to save space, that would
            still require approx 140 GB memory, which makes returning an array with all that
            numbers difficult.



            So that would require a different approach which allows to compute prime numbers in
            some given range, instead of all prime numbers up to a limit.






            share|improve this answer



























              up vote
              7
              down vote



              accepted










              That is not the Sieve of Eratosthenes



              The Sieve of Eratosthenes computes multiples
              of each found prime to mark subsequent composite numbers in the sieve.
              Your algorithm computes the remainder of all subsequent numbers instead.
              That makes a huge difference. I'll come back to that later, let's start
              with a



              Review of your current code



              There is a inconsistency here:



              func generatePrimes(to n: Int) -> [Int] 
              // ...
              var arr : [Int] = Array<Int>(stride(from: 3, to: n, by: 2))
              // ...



              The function – as I understand the to parameter –
              computes all primes up to and including n. On the other hand,
              stride(from: 3, to: n, by: 2) does not include the upper bound,
              and that is easily verified with



              print(generatePrimes(to: 11)) // [2, 3, 5, 7]


              So either rename the function to func generatePrimes(below n: Int)
              or use stride(from: 3, through: n, by: 2) to include the upper bound.
              I'll do the latter for this review.



              It would also be a good idea to add a documentation comment which
              unambiguously documents the function parameters.



              The explicit type annotation is not needed, and the array creation
              can be simplified to



              var arr = Array(stride(from: 3, through: n, by: 2))


              Why is the limit number expected to be greater than 5? That is an
              unnecessary restriction and would be unexpected for a caller of your function. It may be that your implementation does not work if
              $ n le 5 $, but it would be easy to handle that case separately:



              if n <= 5 
              return [2, 3, 5].filter $0 <= n



              In



               let num : Int = arr.remove(at: index)


              the type annotation on the left is again not needed.



              (Remark: This



               arr = arr.filter $0 % num != 0 


              can be replaced by the more efficient



               arr.removeAll(where: $0 % num == 0 )


              in a future version of Swift, when SE-0197 Adding in-place removeAll(where:) to the Standard Library is implemented in Swift 4.2.)



              Here



               if arr[index + 1] >= Int(floor(sqrtf(Float(n)))) break 


              is a hidden bug: As



              print(generatePrimes(to: 26)) // [2, 3, 5, 7, 11, 13, 17, 19, 23, 25]


              demonstrates, you are leaving the loop to early.
              I would also compute the square root only once, and use double-precision
              arithmetic: The 24 bit significand of a Float cannot represent large
              integers correctly (compare Computing the integer square root of large numbers).



              Finally, the function should compute the result, but not print it, i.e. this



              //Print Statements
              print("Prime numbers under (n):")
              _ = arr.enumerated().map (index, element) in print("t(index + 1). (element)")


              should be removed. It is generally a good habit to separate computation from I/O.
              In addition, this distorts the benchmark results, because you measure also the
              time for converting the numbers to strings, and the time to print these strings
              (which depends on the output device, printing to a file is faster than printing
              to the Terminal or the Xcode console).



              With all changes suggested so far, your function would look like this:



              /// Compute list of primes
              ///
              /// - Parameter n: The upper bound
              /// - Returns: An array with all primes <= `n`
              func generatePrimes(to n: Int) -> [Int]

              if n <= 5
              return [2, 3, 5].filter $0 <= n


              var arr = Array(stride(from: 3, through: n, by: 2))

              let squareRootN = Int(Double(n).squareRoot())
              for index in 0...
              if arr[index] > squareRootN break
              let num = arr.remove(at: index)
              arr = arr.filter $0 % num != 0
              arr.insert(num, at: index)


              arr.insert(2, at: 0)
              return arr



              Using the Sieve of Eratosthenes



              As I said above, your algorithm is different from the Eratosthenes sieve.
              Each time a prime number $ p $ is found, your code does trial divisions for
              all remaining numbers to remove multiples of $ p $.
              The Eratosthenes sieve computes multiples of $ p $ instead:



              p*p, p*p + p, p*p + 2*p, p*p + 3*p, ...


              and marks these as composite. Note that there are far less values to compute.



              Also your algorithm removes and inserts array elements frequently.
              That is slow because the remaining elements must be shifted to the left
              or to the right.



              The Eratosthenes sieve works with a fixed-sized boolean array instead.



              Here is a very simple straight-forward implementation in Swift:



              func eratosthenesSieve(to n: Int) -> [Int] 
              var composite = Array(repeating: false, count: n + 1) // The sieve
              var primes: [Int] =

              if n >= 55
              // Upper bound for the number of primes up to and including `n`,
              // from https://en.wikipedia.org/wiki/Prime_number_theorem#Non-asymptotic_bounds_on_the_prime-counting_function :
              let d = Double(n)
              let upperBound = Int(d / (log(d) - 4))
              primes.reserveCapacity(upperBound)


              let squareRootN = Int(Double(n).squareRoot())
              var p = 2
              while p <= squareRootN
              if !composite[p]
              primes.append(p)
              for q in stride(from: p * p, through: n, by: p)
              composite[q] = true


              p += 1

              while p <= n
              if !composite[p]
              primes.append(p)

              p += 1

              return primes



              The only optimization I did was to reserve capacity for the result
              array in order to avoid reallocations.



              This is already considerably faster than your original code.
              Here is my benchmarking code:



              func benchmark(_ ident: String, _ function: (Int) -> [Int], to n: Int) 
              let start = Date()
              let primes = function(n)
              let elapsed = Date().timeIntervalSince(start)
              print(ident, primes.count, "primes up to", n, String(format: "time: %.3f", elapsed))


              benchmark("original", generatePrimes, to: 100_000)
              benchmark("original", generatePrimes, to: 1_000_000)
              benchmark("original", generatePrimes, to: 10_000_000)

              print()

              benchmark("eratosthenes", eratosthenesSieve, to: 100_000)
              benchmark("eratosthenes", eratosthenesSieve, to: 1_000_000)
              benchmark("eratosthenes", eratosthenesSieve, to: 10_000_000)
              benchmark("eratosthenes", eratosthenesSieve, to: 100_000_000)
              benchmark("eratosthenes", eratosthenesSieve, to: 1_000_000_000)


              On a 1.2 GHz Intel Core m5 MacBook (Retina, 12-inch, Early 2016)
              with 8GM RAM I get the following results (slightly reformatted to
              increase the legibility):




              original 9592 primes up to 100000 time: 0.038
              original 78498 primes up to 1000000 time: 0.418
              original 664579 primes up to 10000000 time: 9.739

              eratosthenes 9592 primes up to 100000 time: 0.001
              eratosthenes 78498 primes up to 1000000 time: 0.007
              eratosthenes 664579 primes up to 10000000 time: 0.096
              eratosthenes 5761455 primes up to 100000000 time: 1.223
              eratosthenes 50847534 primes up to 1000000000 time: 15.034


              This can surely be further improved, a first step would be to handle the case p=2
              separately and to use the sieve only for odd numbers (which however complicates the
              index calculations a bit).



              But, as you see, computing the primes up to 1 billion is feasible with the sieve
              of Eratosthenes (and the numbers are correct, you can compare them with
              https://primes.utm.edu/howmany.html).



              Going for the trillion



              You were also thinking of computing all primes up to one trillion. That might be
              a problem, no matter which algorithm you choose. According to
              https://primes.utm.edu/howmany.html, there are 37,607,912,018 prime numbers
              below 1 trillion. Even if we use 32-bit bit integers to save space, that would
              still require approx 140 GB memory, which makes returning an array with all that
              numbers difficult.



              So that would require a different approach which allows to compute prime numbers in
              some given range, instead of all prime numbers up to a limit.






              share|improve this answer

























                up vote
                7
                down vote



                accepted







                up vote
                7
                down vote



                accepted






                That is not the Sieve of Eratosthenes



                The Sieve of Eratosthenes computes multiples
                of each found prime to mark subsequent composite numbers in the sieve.
                Your algorithm computes the remainder of all subsequent numbers instead.
                That makes a huge difference. I'll come back to that later, let's start
                with a



                Review of your current code



                There is a inconsistency here:



                func generatePrimes(to n: Int) -> [Int] 
                // ...
                var arr : [Int] = Array<Int>(stride(from: 3, to: n, by: 2))
                // ...



                The function – as I understand the to parameter –
                computes all primes up to and including n. On the other hand,
                stride(from: 3, to: n, by: 2) does not include the upper bound,
                and that is easily verified with



                print(generatePrimes(to: 11)) // [2, 3, 5, 7]


                So either rename the function to func generatePrimes(below n: Int)
                or use stride(from: 3, through: n, by: 2) to include the upper bound.
                I'll do the latter for this review.



                It would also be a good idea to add a documentation comment which
                unambiguously documents the function parameters.



                The explicit type annotation is not needed, and the array creation
                can be simplified to



                var arr = Array(stride(from: 3, through: n, by: 2))


                Why is the limit number expected to be greater than 5? That is an
                unnecessary restriction and would be unexpected for a caller of your function. It may be that your implementation does not work if
                $ n le 5 $, but it would be easy to handle that case separately:



                if n <= 5 
                return [2, 3, 5].filter $0 <= n



                In



                 let num : Int = arr.remove(at: index)


                the type annotation on the left is again not needed.



                (Remark: This



                 arr = arr.filter $0 % num != 0 


                can be replaced by the more efficient



                 arr.removeAll(where: $0 % num == 0 )


                in a future version of Swift, when SE-0197 Adding in-place removeAll(where:) to the Standard Library is implemented in Swift 4.2.)



                Here



                 if arr[index + 1] >= Int(floor(sqrtf(Float(n)))) break 


                is a hidden bug: As



                print(generatePrimes(to: 26)) // [2, 3, 5, 7, 11, 13, 17, 19, 23, 25]


                demonstrates, you are leaving the loop to early.
                I would also compute the square root only once, and use double-precision
                arithmetic: The 24 bit significand of a Float cannot represent large
                integers correctly (compare Computing the integer square root of large numbers).



                Finally, the function should compute the result, but not print it, i.e. this



                //Print Statements
                print("Prime numbers under (n):")
                _ = arr.enumerated().map (index, element) in print("t(index + 1). (element)")


                should be removed. It is generally a good habit to separate computation from I/O.
                In addition, this distorts the benchmark results, because you measure also the
                time for converting the numbers to strings, and the time to print these strings
                (which depends on the output device, printing to a file is faster than printing
                to the Terminal or the Xcode console).



                With all changes suggested so far, your function would look like this:



                /// Compute list of primes
                ///
                /// - Parameter n: The upper bound
                /// - Returns: An array with all primes <= `n`
                func generatePrimes(to n: Int) -> [Int]

                if n <= 5
                return [2, 3, 5].filter $0 <= n


                var arr = Array(stride(from: 3, through: n, by: 2))

                let squareRootN = Int(Double(n).squareRoot())
                for index in 0...
                if arr[index] > squareRootN break
                let num = arr.remove(at: index)
                arr = arr.filter $0 % num != 0
                arr.insert(num, at: index)


                arr.insert(2, at: 0)
                return arr



                Using the Sieve of Eratosthenes



                As I said above, your algorithm is different from the Eratosthenes sieve.
                Each time a prime number $ p $ is found, your code does trial divisions for
                all remaining numbers to remove multiples of $ p $.
                The Eratosthenes sieve computes multiples of $ p $ instead:



                p*p, p*p + p, p*p + 2*p, p*p + 3*p, ...


                and marks these as composite. Note that there are far less values to compute.



                Also your algorithm removes and inserts array elements frequently.
                That is slow because the remaining elements must be shifted to the left
                or to the right.



                The Eratosthenes sieve works with a fixed-sized boolean array instead.



                Here is a very simple straight-forward implementation in Swift:



                func eratosthenesSieve(to n: Int) -> [Int] 
                var composite = Array(repeating: false, count: n + 1) // The sieve
                var primes: [Int] =

                if n >= 55
                // Upper bound for the number of primes up to and including `n`,
                // from https://en.wikipedia.org/wiki/Prime_number_theorem#Non-asymptotic_bounds_on_the_prime-counting_function :
                let d = Double(n)
                let upperBound = Int(d / (log(d) - 4))
                primes.reserveCapacity(upperBound)


                let squareRootN = Int(Double(n).squareRoot())
                var p = 2
                while p <= squareRootN
                if !composite[p]
                primes.append(p)
                for q in stride(from: p * p, through: n, by: p)
                composite[q] = true


                p += 1

                while p <= n
                if !composite[p]
                primes.append(p)

                p += 1

                return primes



                The only optimization I did was to reserve capacity for the result
                array in order to avoid reallocations.



                This is already considerably faster than your original code.
                Here is my benchmarking code:



                func benchmark(_ ident: String, _ function: (Int) -> [Int], to n: Int) 
                let start = Date()
                let primes = function(n)
                let elapsed = Date().timeIntervalSince(start)
                print(ident, primes.count, "primes up to", n, String(format: "time: %.3f", elapsed))


                benchmark("original", generatePrimes, to: 100_000)
                benchmark("original", generatePrimes, to: 1_000_000)
                benchmark("original", generatePrimes, to: 10_000_000)

                print()

                benchmark("eratosthenes", eratosthenesSieve, to: 100_000)
                benchmark("eratosthenes", eratosthenesSieve, to: 1_000_000)
                benchmark("eratosthenes", eratosthenesSieve, to: 10_000_000)
                benchmark("eratosthenes", eratosthenesSieve, to: 100_000_000)
                benchmark("eratosthenes", eratosthenesSieve, to: 1_000_000_000)


                On a 1.2 GHz Intel Core m5 MacBook (Retina, 12-inch, Early 2016)
                with 8GM RAM I get the following results (slightly reformatted to
                increase the legibility):




                original 9592 primes up to 100000 time: 0.038
                original 78498 primes up to 1000000 time: 0.418
                original 664579 primes up to 10000000 time: 9.739

                eratosthenes 9592 primes up to 100000 time: 0.001
                eratosthenes 78498 primes up to 1000000 time: 0.007
                eratosthenes 664579 primes up to 10000000 time: 0.096
                eratosthenes 5761455 primes up to 100000000 time: 1.223
                eratosthenes 50847534 primes up to 1000000000 time: 15.034


                This can surely be further improved, a first step would be to handle the case p=2
                separately and to use the sieve only for odd numbers (which however complicates the
                index calculations a bit).



                But, as you see, computing the primes up to 1 billion is feasible with the sieve
                of Eratosthenes (and the numbers are correct, you can compare them with
                https://primes.utm.edu/howmany.html).



                Going for the trillion



                You were also thinking of computing all primes up to one trillion. That might be
                a problem, no matter which algorithm you choose. According to
                https://primes.utm.edu/howmany.html, there are 37,607,912,018 prime numbers
                below 1 trillion. Even if we use 32-bit bit integers to save space, that would
                still require approx 140 GB memory, which makes returning an array with all that
                numbers difficult.



                So that would require a different approach which allows to compute prime numbers in
                some given range, instead of all prime numbers up to a limit.






                share|improve this answer















                That is not the Sieve of Eratosthenes



                The Sieve of Eratosthenes computes multiples
                of each found prime to mark subsequent composite numbers in the sieve.
                Your algorithm computes the remainder of all subsequent numbers instead.
                That makes a huge difference. I'll come back to that later, let's start
                with a



                Review of your current code



                There is a inconsistency here:



                func generatePrimes(to n: Int) -> [Int] 
                // ...
                var arr : [Int] = Array<Int>(stride(from: 3, to: n, by: 2))
                // ...



                The function – as I understand the to parameter –
                computes all primes up to and including n. On the other hand,
                stride(from: 3, to: n, by: 2) does not include the upper bound,
                and that is easily verified with



                print(generatePrimes(to: 11)) // [2, 3, 5, 7]


                So either rename the function to func generatePrimes(below n: Int)
                or use stride(from: 3, through: n, by: 2) to include the upper bound.
                I'll do the latter for this review.



                It would also be a good idea to add a documentation comment which
                unambiguously documents the function parameters.



                The explicit type annotation is not needed, and the array creation
                can be simplified to



                var arr = Array(stride(from: 3, through: n, by: 2))


                Why is the limit number expected to be greater than 5? That is an
                unnecessary restriction and would be unexpected for a caller of your function. It may be that your implementation does not work if
                $ n le 5 $, but it would be easy to handle that case separately:



                if n <= 5 
                return [2, 3, 5].filter $0 <= n



                In



                 let num : Int = arr.remove(at: index)


                the type annotation on the left is again not needed.



                (Remark: This



                 arr = arr.filter $0 % num != 0 


                can be replaced by the more efficient



                 arr.removeAll(where: $0 % num == 0 )


                in a future version of Swift, when SE-0197 Adding in-place removeAll(where:) to the Standard Library is implemented in Swift 4.2.)



                Here



                 if arr[index + 1] >= Int(floor(sqrtf(Float(n)))) break 


                is a hidden bug: As



                print(generatePrimes(to: 26)) // [2, 3, 5, 7, 11, 13, 17, 19, 23, 25]


                demonstrates, you are leaving the loop to early.
                I would also compute the square root only once, and use double-precision
                arithmetic: The 24 bit significand of a Float cannot represent large
                integers correctly (compare Computing the integer square root of large numbers).



                Finally, the function should compute the result, but not print it, i.e. this



                //Print Statements
                print("Prime numbers under (n):")
                _ = arr.enumerated().map (index, element) in print("t(index + 1). (element)")


                should be removed. It is generally a good habit to separate computation from I/O.
                In addition, this distorts the benchmark results, because you measure also the
                time for converting the numbers to strings, and the time to print these strings
                (which depends on the output device, printing to a file is faster than printing
                to the Terminal or the Xcode console).



                With all changes suggested so far, your function would look like this:



                /// Compute list of primes
                ///
                /// - Parameter n: The upper bound
                /// - Returns: An array with all primes <= `n`
                func generatePrimes(to n: Int) -> [Int]

                if n <= 5
                return [2, 3, 5].filter $0 <= n


                var arr = Array(stride(from: 3, through: n, by: 2))

                let squareRootN = Int(Double(n).squareRoot())
                for index in 0...
                if arr[index] > squareRootN break
                let num = arr.remove(at: index)
                arr = arr.filter $0 % num != 0
                arr.insert(num, at: index)


                arr.insert(2, at: 0)
                return arr



                Using the Sieve of Eratosthenes



                As I said above, your algorithm is different from the Eratosthenes sieve.
                Each time a prime number $ p $ is found, your code does trial divisions for
                all remaining numbers to remove multiples of $ p $.
                The Eratosthenes sieve computes multiples of $ p $ instead:



                p*p, p*p + p, p*p + 2*p, p*p + 3*p, ...


                and marks these as composite. Note that there are far less values to compute.



                Also your algorithm removes and inserts array elements frequently.
                That is slow because the remaining elements must be shifted to the left
                or to the right.



                The Eratosthenes sieve works with a fixed-sized boolean array instead.



                Here is a very simple straight-forward implementation in Swift:



                func eratosthenesSieve(to n: Int) -> [Int] 
                var composite = Array(repeating: false, count: n + 1) // The sieve
                var primes: [Int] =

                if n >= 55
                // Upper bound for the number of primes up to and including `n`,
                // from https://en.wikipedia.org/wiki/Prime_number_theorem#Non-asymptotic_bounds_on_the_prime-counting_function :
                let d = Double(n)
                let upperBound = Int(d / (log(d) - 4))
                primes.reserveCapacity(upperBound)


                let squareRootN = Int(Double(n).squareRoot())
                var p = 2
                while p <= squareRootN
                if !composite[p]
                primes.append(p)
                for q in stride(from: p * p, through: n, by: p)
                composite[q] = true


                p += 1

                while p <= n
                if !composite[p]
                primes.append(p)

                p += 1

                return primes



                The only optimization I did was to reserve capacity for the result
                array in order to avoid reallocations.



                This is already considerably faster than your original code.
                Here is my benchmarking code:



                func benchmark(_ ident: String, _ function: (Int) -> [Int], to n: Int) 
                let start = Date()
                let primes = function(n)
                let elapsed = Date().timeIntervalSince(start)
                print(ident, primes.count, "primes up to", n, String(format: "time: %.3f", elapsed))


                benchmark("original", generatePrimes, to: 100_000)
                benchmark("original", generatePrimes, to: 1_000_000)
                benchmark("original", generatePrimes, to: 10_000_000)

                print()

                benchmark("eratosthenes", eratosthenesSieve, to: 100_000)
                benchmark("eratosthenes", eratosthenesSieve, to: 1_000_000)
                benchmark("eratosthenes", eratosthenesSieve, to: 10_000_000)
                benchmark("eratosthenes", eratosthenesSieve, to: 100_000_000)
                benchmark("eratosthenes", eratosthenesSieve, to: 1_000_000_000)


                On a 1.2 GHz Intel Core m5 MacBook (Retina, 12-inch, Early 2016)
                with 8GM RAM I get the following results (slightly reformatted to
                increase the legibility):




                original 9592 primes up to 100000 time: 0.038
                original 78498 primes up to 1000000 time: 0.418
                original 664579 primes up to 10000000 time: 9.739

                eratosthenes 9592 primes up to 100000 time: 0.001
                eratosthenes 78498 primes up to 1000000 time: 0.007
                eratosthenes 664579 primes up to 10000000 time: 0.096
                eratosthenes 5761455 primes up to 100000000 time: 1.223
                eratosthenes 50847534 primes up to 1000000000 time: 15.034


                This can surely be further improved, a first step would be to handle the case p=2
                separately and to use the sieve only for odd numbers (which however complicates the
                index calculations a bit).



                But, as you see, computing the primes up to 1 billion is feasible with the sieve
                of Eratosthenes (and the numbers are correct, you can compare them with
                https://primes.utm.edu/howmany.html).



                Going for the trillion



                You were also thinking of computing all primes up to one trillion. That might be
                a problem, no matter which algorithm you choose. According to
                https://primes.utm.edu/howmany.html, there are 37,607,912,018 prime numbers
                below 1 trillion. Even if we use 32-bit bit integers to save space, that would
                still require approx 140 GB memory, which makes returning an array with all that
                numbers difficult.



                So that would require a different approach which allows to compute prime numbers in
                some given range, instead of all prime numbers up to a limit.







                share|improve this answer















                share|improve this answer



                share|improve this answer








                edited Apr 14 at 22:58









                Noah Wilder

                745




                745











                answered Apr 14 at 12:18









                Martin R

                14k12157




                14k12157






















                     

                    draft saved


                    draft discarded


























                     


                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function ()
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f192021%2fprime-number-generator-efficiency%23new-answer', 'question_page');

                    );

                    Post as a guest













































































                    Popular posts from this blog

                    Greedy Best First Search implementation in Rust

                    Function to Return a JSON Like Objects Using VBA Collections and Arrays

                    C++11 CLH Lock Implementation