Using bisect to flip coins

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
10
down vote

favorite
3












I am trying to solve this problem on Codechef, which has a list of $N$ coins, all initially tails-up, and a list of $Q$ commands. There are two possible commands, 0 A B flips all coins in the interval [A, B] and 1 A B prints the number of heads-up coins in the interval [A, B] (both intervals include B).



After reviewing somebody else's answer that exceeded the time-limit, I came up with a fast numpy answer, which was still not fast enough for the online judge (3s instead of 0.3s).



However, one other reviewer suggested the following algorithm:




  • The key observation is that a 0 a b operation is in fact a pair of operations: flip coins from a to the end, and flip coins from b + 1 to the end. This leads to the solution:

    • Decouple the 0 a b operation into a and b+1

    • Maintain the list of flip points, and keep it ordered

    • To perform a 1 operation, find the largest flip point before A. If its index in flip list is even, the A coin is tails
      up, otherwise it is heads up. Then, traverse the flip list until B
      adding up lengths of every other interval.




I tried to implement this, using bisect to keep the list of flips sorted, but it is a lot slower (20s for the same worst-case input as the 3s quoted above).



from bisect import bisect_left, insort

n, q = map(int, input().split())

flips =

for _ in range(q):
command, start, end = map(int, input().split())
if command == 0:
insort(flips, start)
insort(flips, end + 1)
elif command == 1:
flip_start = bisect_left(flips, start)
flip_end = bisect_left(flips, end + 1)
flips_temp = flips[flip_start:flip_end]
insort(flips_temp, end + 1)
if flip_start % 2:
insort(flips_temp, start)
print(sum(flips_temp[1::2]) - sum(flips_temp[:-1:2]))


However, this is probably not the best implementation of that algorithm, maybe there is a way to use heapq to keep the list sorted (but then slicing and traversing it become harder). So, any comments, especially on how to speed this up, are welcome.



I am aware that one should normally put this into a function and make it re-usable, but since this is time-critical code I think it is fine to leave it all in the global namespace. It is a one-shot script, anyway.




The worst-case input file I used (which I generated using the code in my linked answer from above) can be found here (1.4 MB). It assumes the maximum allowed values for $N$ and $Q$ and contains a random sequence of the two allowed commands.



I feed it to this code with the following command (and time it at the same time):



time cat flipping_coins.dat | python3 flipping_coins.py


Running only the cat command already takes 0.17s.




As Ev. Kounis noted in the comments, there is currently no solution in pure Python that successfully beats this challenge. There are, however, a couple of PyPy solutions, which use segment trees.







share|improve this question

















  • 2




    I would also like to comment that currently there is no successful pure Python solution to raise the "bounty" and stakes a bit.
    – Ev. Kounis
    Jan 26 at 9:42






  • 1




    There is now a successful pure Python solution, though only for Python 2, as mentioned in this answer
    – Alex Selby
    Jan 26 at 18:31
















up vote
10
down vote

favorite
3












I am trying to solve this problem on Codechef, which has a list of $N$ coins, all initially tails-up, and a list of $Q$ commands. There are two possible commands, 0 A B flips all coins in the interval [A, B] and 1 A B prints the number of heads-up coins in the interval [A, B] (both intervals include B).



After reviewing somebody else's answer that exceeded the time-limit, I came up with a fast numpy answer, which was still not fast enough for the online judge (3s instead of 0.3s).



However, one other reviewer suggested the following algorithm:




  • The key observation is that a 0 a b operation is in fact a pair of operations: flip coins from a to the end, and flip coins from b + 1 to the end. This leads to the solution:

    • Decouple the 0 a b operation into a and b+1

    • Maintain the list of flip points, and keep it ordered

    • To perform a 1 operation, find the largest flip point before A. If its index in flip list is even, the A coin is tails
      up, otherwise it is heads up. Then, traverse the flip list until B
      adding up lengths of every other interval.




I tried to implement this, using bisect to keep the list of flips sorted, but it is a lot slower (20s for the same worst-case input as the 3s quoted above).



from bisect import bisect_left, insort

n, q = map(int, input().split())

flips =

for _ in range(q):
command, start, end = map(int, input().split())
if command == 0:
insort(flips, start)
insort(flips, end + 1)
elif command == 1:
flip_start = bisect_left(flips, start)
flip_end = bisect_left(flips, end + 1)
flips_temp = flips[flip_start:flip_end]
insort(flips_temp, end + 1)
if flip_start % 2:
insort(flips_temp, start)
print(sum(flips_temp[1::2]) - sum(flips_temp[:-1:2]))


However, this is probably not the best implementation of that algorithm, maybe there is a way to use heapq to keep the list sorted (but then slicing and traversing it become harder). So, any comments, especially on how to speed this up, are welcome.



I am aware that one should normally put this into a function and make it re-usable, but since this is time-critical code I think it is fine to leave it all in the global namespace. It is a one-shot script, anyway.




The worst-case input file I used (which I generated using the code in my linked answer from above) can be found here (1.4 MB). It assumes the maximum allowed values for $N$ and $Q$ and contains a random sequence of the two allowed commands.



I feed it to this code with the following command (and time it at the same time):



time cat flipping_coins.dat | python3 flipping_coins.py


Running only the cat command already takes 0.17s.




As Ev. Kounis noted in the comments, there is currently no solution in pure Python that successfully beats this challenge. There are, however, a couple of PyPy solutions, which use segment trees.







share|improve this question

















  • 2




    I would also like to comment that currently there is no successful pure Python solution to raise the "bounty" and stakes a bit.
    – Ev. Kounis
    Jan 26 at 9:42






  • 1




    There is now a successful pure Python solution, though only for Python 2, as mentioned in this answer
    – Alex Selby
    Jan 26 at 18:31












up vote
10
down vote

favorite
3









up vote
10
down vote

favorite
3






3





I am trying to solve this problem on Codechef, which has a list of $N$ coins, all initially tails-up, and a list of $Q$ commands. There are two possible commands, 0 A B flips all coins in the interval [A, B] and 1 A B prints the number of heads-up coins in the interval [A, B] (both intervals include B).



After reviewing somebody else's answer that exceeded the time-limit, I came up with a fast numpy answer, which was still not fast enough for the online judge (3s instead of 0.3s).



However, one other reviewer suggested the following algorithm:




  • The key observation is that a 0 a b operation is in fact a pair of operations: flip coins from a to the end, and flip coins from b + 1 to the end. This leads to the solution:

    • Decouple the 0 a b operation into a and b+1

    • Maintain the list of flip points, and keep it ordered

    • To perform a 1 operation, find the largest flip point before A. If its index in flip list is even, the A coin is tails
      up, otherwise it is heads up. Then, traverse the flip list until B
      adding up lengths of every other interval.




I tried to implement this, using bisect to keep the list of flips sorted, but it is a lot slower (20s for the same worst-case input as the 3s quoted above).



from bisect import bisect_left, insort

n, q = map(int, input().split())

flips =

for _ in range(q):
command, start, end = map(int, input().split())
if command == 0:
insort(flips, start)
insort(flips, end + 1)
elif command == 1:
flip_start = bisect_left(flips, start)
flip_end = bisect_left(flips, end + 1)
flips_temp = flips[flip_start:flip_end]
insort(flips_temp, end + 1)
if flip_start % 2:
insort(flips_temp, start)
print(sum(flips_temp[1::2]) - sum(flips_temp[:-1:2]))


However, this is probably not the best implementation of that algorithm, maybe there is a way to use heapq to keep the list sorted (but then slicing and traversing it become harder). So, any comments, especially on how to speed this up, are welcome.



I am aware that one should normally put this into a function and make it re-usable, but since this is time-critical code I think it is fine to leave it all in the global namespace. It is a one-shot script, anyway.




The worst-case input file I used (which I generated using the code in my linked answer from above) can be found here (1.4 MB). It assumes the maximum allowed values for $N$ and $Q$ and contains a random sequence of the two allowed commands.



I feed it to this code with the following command (and time it at the same time):



time cat flipping_coins.dat | python3 flipping_coins.py


Running only the cat command already takes 0.17s.




As Ev. Kounis noted in the comments, there is currently no solution in pure Python that successfully beats this challenge. There are, however, a couple of PyPy solutions, which use segment trees.







share|improve this question













I am trying to solve this problem on Codechef, which has a list of $N$ coins, all initially tails-up, and a list of $Q$ commands. There are two possible commands, 0 A B flips all coins in the interval [A, B] and 1 A B prints the number of heads-up coins in the interval [A, B] (both intervals include B).



After reviewing somebody else's answer that exceeded the time-limit, I came up with a fast numpy answer, which was still not fast enough for the online judge (3s instead of 0.3s).



However, one other reviewer suggested the following algorithm:




  • The key observation is that a 0 a b operation is in fact a pair of operations: flip coins from a to the end, and flip coins from b + 1 to the end. This leads to the solution:

    • Decouple the 0 a b operation into a and b+1

    • Maintain the list of flip points, and keep it ordered

    • To perform a 1 operation, find the largest flip point before A. If its index in flip list is even, the A coin is tails
      up, otherwise it is heads up. Then, traverse the flip list until B
      adding up lengths of every other interval.




I tried to implement this, using bisect to keep the list of flips sorted, but it is a lot slower (20s for the same worst-case input as the 3s quoted above).



from bisect import bisect_left, insort

n, q = map(int, input().split())

flips =

for _ in range(q):
command, start, end = map(int, input().split())
if command == 0:
insort(flips, start)
insort(flips, end + 1)
elif command == 1:
flip_start = bisect_left(flips, start)
flip_end = bisect_left(flips, end + 1)
flips_temp = flips[flip_start:flip_end]
insort(flips_temp, end + 1)
if flip_start % 2:
insort(flips_temp, start)
print(sum(flips_temp[1::2]) - sum(flips_temp[:-1:2]))


However, this is probably not the best implementation of that algorithm, maybe there is a way to use heapq to keep the list sorted (but then slicing and traversing it become harder). So, any comments, especially on how to speed this up, are welcome.



I am aware that one should normally put this into a function and make it re-usable, but since this is time-critical code I think it is fine to leave it all in the global namespace. It is a one-shot script, anyway.




The worst-case input file I used (which I generated using the code in my linked answer from above) can be found here (1.4 MB). It assumes the maximum allowed values for $N$ and $Q$ and contains a random sequence of the two allowed commands.



I feed it to this code with the following command (and time it at the same time):



time cat flipping_coins.dat | python3 flipping_coins.py


Running only the cat command already takes 0.17s.




As Ev. Kounis noted in the comments, there is currently no solution in pure Python that successfully beats this challenge. There are, however, a couple of PyPy solutions, which use segment trees.









share|improve this question












share|improve this question




share|improve this question








edited Jun 14 at 14:12
























asked Jan 26 at 9:36









Graipher

20.5k43081




20.5k43081







  • 2




    I would also like to comment that currently there is no successful pure Python solution to raise the "bounty" and stakes a bit.
    – Ev. Kounis
    Jan 26 at 9:42






  • 1




    There is now a successful pure Python solution, though only for Python 2, as mentioned in this answer
    – Alex Selby
    Jan 26 at 18:31












  • 2




    I would also like to comment that currently there is no successful pure Python solution to raise the "bounty" and stakes a bit.
    – Ev. Kounis
    Jan 26 at 9:42






  • 1




    There is now a successful pure Python solution, though only for Python 2, as mentioned in this answer
    – Alex Selby
    Jan 26 at 18:31







2




2




I would also like to comment that currently there is no successful pure Python solution to raise the "bounty" and stakes a bit.
– Ev. Kounis
Jan 26 at 9:42




I would also like to comment that currently there is no successful pure Python solution to raise the "bounty" and stakes a bit.
– Ev. Kounis
Jan 26 at 9:42




1




1




There is now a successful pure Python solution, though only for Python 2, as mentioned in this answer
– Alex Selby
Jan 26 at 18:31




There is now a successful pure Python solution, though only for Python 2, as mentioned in this answer
– Alex Selby
Jan 26 at 18:31










1 Answer
1






active

oldest

votes

















up vote
2
down vote














 if command == 0:
insort(flips, start)
insort(flips, end + 1)



Well, the main reason it's slow is insort. As the doc says,




Keep in mind that the O(log n) search is dominated by the slow O(n) insertion step.




That said, there's a significant performance improvement possible here without changing data structure. Flipping coins from a to the end and then flipping coins from a to the end is equivalent to doing absolutely nothing. So rather than always insert, toggle whether or not the value is in flips. I see an execution time reduction from about 30s to about 16s with



def toggle(flips, val):
idx = bisect_left(flips, val)
if idx < len(flips) and flips[idx] == val:
del flips[idx]
else:
flips.insert(idx, val)


and



 if command == 0:
toggle(flips, start)
toggle(flips, end + 1)


Obviously if you want speed then you need to switch to using a better data structure than a list, but within the constraints of list I think this is fairly efficient.





 elif command == 1:
flip_start = bisect_left(flips, start)
flip_end = bisect_left(flips, end + 1)
flips_temp = flips[flip_start:flip_end]
insort(flips_temp, end + 1)
if flip_start % 2:
insort(flips_temp, start)
print(sum(flips_temp[1::2]) - sum(flips_temp[:-1:2]))



The first thing that's suspicious here is insort. However, to my surprise, using flips_temp.append(end + 1) and flips_temp.insert(0, start) made things slightly slower. The real speedup comes from eliminating the copying. Careful analysis shows that the logic is equivalent to



 elif command == 1:
flip_start = bisect_left(flips, start)
flip_end = bisect_left(flips, end + 1)

accum = sum(flips[flip_start+1:flip_end:2]) - sum(flips[flip_start:flip_end:2])
if flip_start % 2:
accum = -start - accum
if flip_end % 2:
accum = accum + end + 1
print(accum)


This gave me a further speedup from 16 seconds to 12 seconds.



IMO it would make sense to implement the difference of sums using functools.reduce(operator.sub, ..., 0), but that's much much slower (~40 seconds).






share|improve this answer























    Your Answer




    StackExchange.ifUsing("editor", function ()
    return StackExchange.using("mathjaxEditing", function ()
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
    );
    );
    , "mathjax-editing");

    StackExchange.ifUsing("editor", function ()
    StackExchange.using("externalEditor", function ()
    StackExchange.using("snippets", function ()
    StackExchange.snippets.init();
    );
    );
    , "code-snippets");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "196"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    convertImagesToLinks: false,
    noModals: false,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );








     

    draft saved


    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f186041%2fusing-bisect-to-flip-coins%23new-answer', 'question_page');

    );

    Post as a guest






























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    2
    down vote














     if command == 0:
    insort(flips, start)
    insort(flips, end + 1)



    Well, the main reason it's slow is insort. As the doc says,




    Keep in mind that the O(log n) search is dominated by the slow O(n) insertion step.




    That said, there's a significant performance improvement possible here without changing data structure. Flipping coins from a to the end and then flipping coins from a to the end is equivalent to doing absolutely nothing. So rather than always insert, toggle whether or not the value is in flips. I see an execution time reduction from about 30s to about 16s with



    def toggle(flips, val):
    idx = bisect_left(flips, val)
    if idx < len(flips) and flips[idx] == val:
    del flips[idx]
    else:
    flips.insert(idx, val)


    and



     if command == 0:
    toggle(flips, start)
    toggle(flips, end + 1)


    Obviously if you want speed then you need to switch to using a better data structure than a list, but within the constraints of list I think this is fairly efficient.





     elif command == 1:
    flip_start = bisect_left(flips, start)
    flip_end = bisect_left(flips, end + 1)
    flips_temp = flips[flip_start:flip_end]
    insort(flips_temp, end + 1)
    if flip_start % 2:
    insort(flips_temp, start)
    print(sum(flips_temp[1::2]) - sum(flips_temp[:-1:2]))



    The first thing that's suspicious here is insort. However, to my surprise, using flips_temp.append(end + 1) and flips_temp.insert(0, start) made things slightly slower. The real speedup comes from eliminating the copying. Careful analysis shows that the logic is equivalent to



     elif command == 1:
    flip_start = bisect_left(flips, start)
    flip_end = bisect_left(flips, end + 1)

    accum = sum(flips[flip_start+1:flip_end:2]) - sum(flips[flip_start:flip_end:2])
    if flip_start % 2:
    accum = -start - accum
    if flip_end % 2:
    accum = accum + end + 1
    print(accum)


    This gave me a further speedup from 16 seconds to 12 seconds.



    IMO it would make sense to implement the difference of sums using functools.reduce(operator.sub, ..., 0), but that's much much slower (~40 seconds).






    share|improve this answer



























      up vote
      2
      down vote














       if command == 0:
      insort(flips, start)
      insort(flips, end + 1)



      Well, the main reason it's slow is insort. As the doc says,




      Keep in mind that the O(log n) search is dominated by the slow O(n) insertion step.




      That said, there's a significant performance improvement possible here without changing data structure. Flipping coins from a to the end and then flipping coins from a to the end is equivalent to doing absolutely nothing. So rather than always insert, toggle whether or not the value is in flips. I see an execution time reduction from about 30s to about 16s with



      def toggle(flips, val):
      idx = bisect_left(flips, val)
      if idx < len(flips) and flips[idx] == val:
      del flips[idx]
      else:
      flips.insert(idx, val)


      and



       if command == 0:
      toggle(flips, start)
      toggle(flips, end + 1)


      Obviously if you want speed then you need to switch to using a better data structure than a list, but within the constraints of list I think this is fairly efficient.





       elif command == 1:
      flip_start = bisect_left(flips, start)
      flip_end = bisect_left(flips, end + 1)
      flips_temp = flips[flip_start:flip_end]
      insort(flips_temp, end + 1)
      if flip_start % 2:
      insort(flips_temp, start)
      print(sum(flips_temp[1::2]) - sum(flips_temp[:-1:2]))



      The first thing that's suspicious here is insort. However, to my surprise, using flips_temp.append(end + 1) and flips_temp.insert(0, start) made things slightly slower. The real speedup comes from eliminating the copying. Careful analysis shows that the logic is equivalent to



       elif command == 1:
      flip_start = bisect_left(flips, start)
      flip_end = bisect_left(flips, end + 1)

      accum = sum(flips[flip_start+1:flip_end:2]) - sum(flips[flip_start:flip_end:2])
      if flip_start % 2:
      accum = -start - accum
      if flip_end % 2:
      accum = accum + end + 1
      print(accum)


      This gave me a further speedup from 16 seconds to 12 seconds.



      IMO it would make sense to implement the difference of sums using functools.reduce(operator.sub, ..., 0), but that's much much slower (~40 seconds).






      share|improve this answer

























        up vote
        2
        down vote










        up vote
        2
        down vote










         if command == 0:
        insort(flips, start)
        insort(flips, end + 1)



        Well, the main reason it's slow is insort. As the doc says,




        Keep in mind that the O(log n) search is dominated by the slow O(n) insertion step.




        That said, there's a significant performance improvement possible here without changing data structure. Flipping coins from a to the end and then flipping coins from a to the end is equivalent to doing absolutely nothing. So rather than always insert, toggle whether or not the value is in flips. I see an execution time reduction from about 30s to about 16s with



        def toggle(flips, val):
        idx = bisect_left(flips, val)
        if idx < len(flips) and flips[idx] == val:
        del flips[idx]
        else:
        flips.insert(idx, val)


        and



         if command == 0:
        toggle(flips, start)
        toggle(flips, end + 1)


        Obviously if you want speed then you need to switch to using a better data structure than a list, but within the constraints of list I think this is fairly efficient.





         elif command == 1:
        flip_start = bisect_left(flips, start)
        flip_end = bisect_left(flips, end + 1)
        flips_temp = flips[flip_start:flip_end]
        insort(flips_temp, end + 1)
        if flip_start % 2:
        insort(flips_temp, start)
        print(sum(flips_temp[1::2]) - sum(flips_temp[:-1:2]))



        The first thing that's suspicious here is insort. However, to my surprise, using flips_temp.append(end + 1) and flips_temp.insert(0, start) made things slightly slower. The real speedup comes from eliminating the copying. Careful analysis shows that the logic is equivalent to



         elif command == 1:
        flip_start = bisect_left(flips, start)
        flip_end = bisect_left(flips, end + 1)

        accum = sum(flips[flip_start+1:flip_end:2]) - sum(flips[flip_start:flip_end:2])
        if flip_start % 2:
        accum = -start - accum
        if flip_end % 2:
        accum = accum + end + 1
        print(accum)


        This gave me a further speedup from 16 seconds to 12 seconds.



        IMO it would make sense to implement the difference of sums using functools.reduce(operator.sub, ..., 0), but that's much much slower (~40 seconds).






        share|improve this answer
















         if command == 0:
        insort(flips, start)
        insort(flips, end + 1)



        Well, the main reason it's slow is insort. As the doc says,




        Keep in mind that the O(log n) search is dominated by the slow O(n) insertion step.




        That said, there's a significant performance improvement possible here without changing data structure. Flipping coins from a to the end and then flipping coins from a to the end is equivalent to doing absolutely nothing. So rather than always insert, toggle whether or not the value is in flips. I see an execution time reduction from about 30s to about 16s with



        def toggle(flips, val):
        idx = bisect_left(flips, val)
        if idx < len(flips) and flips[idx] == val:
        del flips[idx]
        else:
        flips.insert(idx, val)


        and



         if command == 0:
        toggle(flips, start)
        toggle(flips, end + 1)


        Obviously if you want speed then you need to switch to using a better data structure than a list, but within the constraints of list I think this is fairly efficient.





         elif command == 1:
        flip_start = bisect_left(flips, start)
        flip_end = bisect_left(flips, end + 1)
        flips_temp = flips[flip_start:flip_end]
        insort(flips_temp, end + 1)
        if flip_start % 2:
        insort(flips_temp, start)
        print(sum(flips_temp[1::2]) - sum(flips_temp[:-1:2]))



        The first thing that's suspicious here is insort. However, to my surprise, using flips_temp.append(end + 1) and flips_temp.insert(0, start) made things slightly slower. The real speedup comes from eliminating the copying. Careful analysis shows that the logic is equivalent to



         elif command == 1:
        flip_start = bisect_left(flips, start)
        flip_end = bisect_left(flips, end + 1)

        accum = sum(flips[flip_start+1:flip_end:2]) - sum(flips[flip_start:flip_end:2])
        if flip_start % 2:
        accum = -start - accum
        if flip_end % 2:
        accum = accum + end + 1
        print(accum)


        This gave me a further speedup from 16 seconds to 12 seconds.



        IMO it would make sense to implement the difference of sums using functools.reduce(operator.sub, ..., 0), but that's much much slower (~40 seconds).







        share|improve this answer















        share|improve this answer



        share|improve this answer








        edited Jun 14 at 14:17


























        answered Jun 14 at 14:08









        Peter Taylor

        14.1k2454




        14.1k2454






















             

            draft saved


            draft discarded


























             


            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f186041%2fusing-bisect-to-flip-coins%23new-answer', 'question_page');

            );

            Post as a guest













































































            Popular posts from this blog

            Chat program with C++ and SFML

            Function to Return a JSON Like Objects Using VBA Collections and Arrays

            Will my employers contract hold up in court?