Functional Sieve of Eratosthenes in Clojure
For some reason, this is the first time I've ever written an implementation of the Sieve of Eratosthenes. I basically just stared at the Wikipedia walkthrough of the algorithm until it made sense, then tried making the algorithm in idiomatic Clojure.
I'm not using a boolean array representing every number. Instead, I have a marked set and a primes vector that are maintained separately.
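For context, the boolean-array representation I'm referring to would look roughly like this sketch (just to show the approach I avoided, not the code I'm asking to review):

; Sketch only: the conventional mutable boolean-array sieve I chose not to use.
(defn array-sieve [n]
  (let [^booleans composite? (boolean-array n)]   ; all entries start false
    (doseq [p (range 2 (inc (int (Math/sqrt n))))
            :when (not (aget composite? p))
            m (range (* p p) n p)]
      (aset composite? m true))                   ; mark every multiple of p from p*p up
    (filterv #(not (aget composite? %)) (range 2 n))))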
My main concern is performance. I was hoping for better performance than I'm getting. It's only about 2x faster than the naïve implementation (included below the Sieve code for reference). I'd mainly like recommendations for speeding this up.
I decided to go overboard with the use of ->> here. It just fit so perfectly everywhere, but I think I may have taken it too far.
Is there a way to avoid the cond dispatch in the naive-prime? function?
Anything else notable?
(ns irrelevant.sieves.eratosthenes)

; ----- Sieve -----
(defn sieve-primes [n]
  (loop [p 2         ; First prime
         marked #{}  ; Found composites
         primes []]
    (let [mults (->> n
                     (range p)
                     (map #(* p %))
                     (take-while #(< % n)))
          next-p (->> p
                      (inc)
                      (iterate inc)
                      (remove marked)
                      (first))
          new-primes (conj primes p)]
      (if (>= next-p n)
        new-primes
        (recur next-p (into marked mults) new-primes)))))
; ----- Naive -----
(defn naive-prime? [n]
  (cond
    (< n 2) false
    (= n 2) true
    :else (->> n
               (Math/sqrt)
               (inc)
               (range 2)
               (some #(zero? (rem n %)))
               (not))))

(defn naive-primes [n]
  (->> n
       (range 2)
       (filterv naive-prime?)))
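As a quick sanity check, both versions agree on small inputs at the REPL:

(sieve-primes 30)
;=> [2 3 5 7 11 13 17 19 23 29]
(naive-primes 30)
;=> [2 3 5 7 11 13 17 19 23 29]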
Runtime Complexity:
I wrote up a testing function to automate timing with the help of Criterium for benchmarking:
; c is an alias for Criterium's namespace, e.g. (require '[criterium.core :as c])
(defn test-times [start-n end-n step-factor]
  (->> (iterate #(* step-factor %) start-n)
       (take-while #(< % end-n))
       (mapv (fn [n] [n (->> (c/quick-benchmark*
                               (fn [] (sieve-primes n))
                               nil)
                             (:mean)
                             (first))]))))
This tests how long it takes for various values of n, then packages the results in a vector of [n time-taken] pairs. Here are the results (times in seconds):
(test-times 1 1e7 2)
=>
[[1 2.3196602948749615E-7]
[2 2.3105128053865377E-7]
[4 1.42932799280724E-6]
[8 5.63559279071997E-6]
[16 1.2984918224299065E-5]
[32 3.289705735278571E-5]
[64 7.087895492662475E-5]
[128 1.4285673446327686E-4]
[256 2.923177758112095E-4]
[512 6.273025793650794E-4]
[1024 0.0012981272479166666]
[2048 0.0025652252416666667]
[4096 0.005141740791666667]
[8192 0.01068772265]
[16384 0.022486903166666666]
[32768 0.05367695991666667]
[65536 0.11212592816666668]
[131072 0.23180363016666666]
[262144 0.48056071016666674]
[524288 1.1464995601666668]
[1048576 2.8823068596666666]
[2097152 6.4231157065]
[4194304 15.808042165333335]
[8388608 56.98961213533334]]
Plotting it out, it appears to be roughly O(n^3). The time taken seems to roughly double every time n is doubled (O(n)) until n=524288, then it explodes. A custom Wolfram regression calculator gave a best fit with a cubic curve.
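To turn the table into empirical orders of growth (log2 of the ratio between consecutive timings, i.e. the local exponent k in t ~ n^k), a small helper sketch that assumes the [n time] pairs above were produced with step-factor 2:

(defn growth-orders [results]
  ; results is a vector of [n time] pairs where n doubles between entries;
  ; log2(t2/t1) then estimates the local exponent k in t ~ n^k.
  (mapv (fn [[[_ t1] [_ t2]]]
          (/ (Math/log (/ t2 t1)) (Math/log 2)))
        (partition 2 1 results)))

For example, (growth-orders (test-times 1 1e7 2)) should give values near 1 where the time merely doubles per doubling of n, and larger values where it explodes.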
performance clojure sieve-of-eratosthenes
edited Apr 30 at 17:11
asked Apr 29 at 23:35
Carcigenicate
Empirical Orders of Growth, please! (i.e. compare the speeds at two size points at least, preferably three or more, to get a better picture of what's going on there).
– Will Ness, Apr 30 at 2:14
can it be that (remove marked ps) compares each p in ps against all elements in marked, out of order?
– Will Ness, Apr 30 at 2:27
@WillNess A membership test of a set is near constant. This is also lazy, so it will only search enough to find the first result.
– Carcigenicate, Apr 30 at 3:00
ok, thanks. what about the empirical orders of growth? it would give a good hint about the goings on.
– Will Ness, Apr 30 at 14:36
@WillNess I haven't yet graphed the runtime of different n values. I can do that today. I did notice that it scaled better than the naive version though. It was only like 1.8x faster than the naive for n=1000, but ~2.2x faster for n=10000.
– Carcigenicate, Apr 30 at 14:39