Functional Sieve of Eratosthenes in Clojure
For some reason, this is the first time I've ever written an implementation of the Sieve of Eratosthenes. I basically just stared at the Wikipedia walkthrough of the algorithm until it made sense, then tried making the algorithm in idiomatic Clojure.
I'm not using a boolean array representing every number. Instead, I have a marked set and a primes vector that are maintained separately.
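For context, the boolean-array representation I'm referring to would look roughly like this sketch (just to show the approach I avoided, not the code I'm asking to review):

; Sketch only: the conventional mutable boolean-array sieve I chose not to use.
(defn array-sieve [n]
  (let [^booleans composite? (boolean-array n)]   ; all entries start false
    (doseq [p (range 2 (inc (int (Math/sqrt n))))
            :when (not (aget composite? p))
            m (range (* p p) n p)]
      (aset composite? m true))                   ; mark every multiple of p from p*p up
    (filterv #(not (aget composite? %)) (range 2 n))))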
My main concern is performance. I was hoping for better performance than I'm getting. It's only about 2x faster than the naïve implementation (included below the Sieve code for reference). I'd mainly like recommendations for speeding this up.
I decided to go overboard with the use of ->> here. It just fit so perfectly everywhere, but I think I may have taken it too far.
Is there a way to avoid the cond dispatch in the naive-prime? function?
Anything else notable?
(ns irrelevant.sieves.eratosthenes)

; ----- Sieve -----
(defn sieve-primes [n]
  (loop [p 2         ; First prime
         marked #{}  ; Found composites
         primes []]
    (let [mults (->> n
                     (range p)
                     (map #(* p %))
                     (take-while #(< % n)))
          next-p (->> p
                      (inc)
                      (iterate inc)
                      (remove marked)
                      (first))
          new-primes (conj primes p)]
      (if (>= next-p n)
        new-primes
        (recur next-p (into marked mults) new-primes)))))
; ----- Naive -----
(defn naive-prime? [n]
  (cond
    (< n 2) false
    (= n 2) true
    :else (->> n
               (Math/sqrt)
               (inc)
               (range 2)
               (some #(zero? (rem n %)))
               (not))))

(defn naive-primes [n]
  (->> n
       (range 2)
       (filterv naive-prime?)))
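As a quick sanity check, both versions agree on small inputs at the REPL:

(sieve-primes 30)
;=> [2 3 5 7 11 13 17 19 23 29]
(naive-primes 30)
;=> [2 3 5 7 11 13 17 19 23 29]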
Runtime Complexity:
I wrote up a testing function to automate timing with the help of Criterium for benchmarking:
; c is an alias for Criterium's namespace, e.g. (require '[criterium.core :as c])
(defn test-times [start-n end-n step-factor]
  (->> (iterate #(* step-factor %) start-n)
       (take-while #(< % end-n))
       (mapv (fn [n] [n (->> (c/quick-benchmark*
                               (fn [] (sieve-primes n))
                               nil)
                             (:mean)
                             (first))]))))
This tests how long it takes for various values of n, then packages the results in a vector of [n time-taken] pairs. Here are the results (times in seconds):
(test-times 1 1e7 2)
=>
[[1 2.3196602948749615E-7]
[2 2.3105128053865377E-7]
[4 1.42932799280724E-6]
[8 5.63559279071997E-6]
[16 1.2984918224299065E-5]
[32 3.289705735278571E-5]
[64 7.087895492662475E-5]
[128 1.4285673446327686E-4]
[256 2.923177758112095E-4]
[512 6.273025793650794E-4]
[1024 0.0012981272479166666]
[2048 0.0025652252416666667]
[4096 0.005141740791666667]
[8192 0.01068772265]
[16384 0.022486903166666666]
[32768 0.05367695991666667]
[65536 0.11212592816666668]
[131072 0.23180363016666666]
[262144 0.48056071016666674]
[524288 1.1464995601666668]
[1048576 2.8823068596666666]
[2097152 6.4231157065]
[4194304 15.808042165333335]
[8388608 56.98961213533334]]
Plotting it out, it appears to be roughly O(n^3). The time taken seems to roughly double every time n is doubled (O(n)) until n=524288, then it explodes. A custom Wolfram regression calculator gave a best fit with a cubic curve.
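To turn the table into empirical orders of growth (log2 of the ratio between consecutive timings, i.e. the local exponent k in t ~ n^k), a small helper sketch that assumes the [n time] pairs above were produced with step-factor 2:

(defn growth-orders [results]
  ; results is a vector of [n time] pairs where n doubles between entries;
  ; log2(t2/t1) then estimates the local exponent k in t ~ n^k.
  (mapv (fn [[[_ t1] [_ t2]]]
          (/ (Math/log (/ t2 t1)) (Math/log 2)))
        (partition 2 1 results)))

For example, (growth-orders (test-times 1 1e7 2)) should give values near 1 where the time merely doubles per doubling of n, and larger values where it explodes.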
performance clojure sieve-of-eratosthenes
edited Apr 30 at 17:11
asked Apr 29 at 23:35
Carcigenicate
Empirical Orders of Growth, please! (i.e. compare the speeds at two size points at least, preferably three or more, to get a better picture of what's going on there).
– Will Ness, Apr 30 at 2:14
can it be that (remove marked ps) compares each p in ps against all elements in marked, out of order?
– Will Ness, Apr 30 at 2:27
@WillNess A membership test of a set is near constant. This is also lazy, so it will only search enough to find the first result.
– Carcigenicate, Apr 30 at 3:00
ok, thanks. what about the empirical orders of growth? it would give a good hint about the goings on.
– Will Ness, Apr 30 at 14:36
@WillNess I haven't yet graphed the runtime of different n values. I can do that today. I did notice that it scaled better than the naive version though. It was only like 1.8x faster than the naive for n=1000, but ~2.2x faster for n=10000.
– Carcigenicate, Apr 30 at 14:39