Recursive merge-sort inversion count algorithm very slow

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
1
down vote

favorite












I am trying to implement a recursive count of inversions in a list of integers by using merge sort and counting the number of split inversions on each recursion. My code works but runs painfully slowly above a list of about 100 integers, and I need to count for ~3 orders of magnitude greater. Can anyone see what I'm doing wrong that makes it so slow?



inversions = np.loadtxt("inversion_text.txt", int).tolist()

def sort_count_inversions(X):
n = len(X)

# For subproblems of size 1 there are no inversions and no sorting required
if n == 1:
return 0, X
a = X[:n//2]
b = X[n//2:]

# Count the inversions in subproblems
# Get back the count of inversions and the sorted subproblems
count_a, sorted_a = sort_count_inversions(a)
count_b, sorted_b = sort_count_inversions(b)

# Count the split inversions between sorted sub problems
# Get back the count of split inversions and the the merge-sorted subproblems (sorted X)
split_count, a_b = merge_count_split_inversions(sorted_a,sorted_b)

# return the count of inversions in X and merge-sorted X
count = split_count + count_a + count_b
return count, a_b


def merge_count_split_inversions(X, Y):
n = len(X) + len(Y)
j = 0
k = 0
X_Y =
count_split = 0

# Iterate through X and Y comparing value by value
for i in range(n):
# If we have reached the end of X or Y, there are no more inversions so append the remaining values to X_Y
if j == len(X):
X_Y.extend(Y[k:])
elif k == len(Y):
X_Y.extend(X[j:])
# Add the smaller of the two compared elements to X_Y
# If adding from Y, increment count of split inversions by the number of remaining elements in X
elif X[j] < Y[k]:
X_Y.append(X[j])
j += 1
else:
X_Y.append(Y[k])
k += 1
count_split += len(X) - j

# Return the total count of split inversions and the result of merge_sorting X and Y (X_Y)
return count_split, X_Y

sort_count_inversions(inversions)






share|improve this question

















  • 1




    Happy to take a look, but have you profiled your code yet?
    – scnerd
    May 15 at 13:43










  • Could you provide a sample of inversions?
    – scnerd
    May 15 at 13:46






  • 1




    You say the algorithm works, but the sorted result that gets returned is much larger than the original input array. Whatever is causing that might also be the source of the slowness.
    – scnerd
    May 15 at 13:52
















up vote
1
down vote

favorite












I am trying to implement a recursive count of inversions in a list of integers by using merge sort and counting the number of split inversions on each recursion. My code works but runs painfully slowly above a list of about 100 integers, and I need to count for ~3 orders of magnitude greater. Can anyone see what I'm doing wrong that makes it so slow?



inversions = np.loadtxt("inversion_text.txt", int).tolist()

def sort_count_inversions(X):
n = len(X)

# For subproblems of size 1 there are no inversions and no sorting required
if n == 1:
return 0, X
a = X[:n//2]
b = X[n//2:]

# Count the inversions in subproblems
# Get back the count of inversions and the sorted subproblems
count_a, sorted_a = sort_count_inversions(a)
count_b, sorted_b = sort_count_inversions(b)

# Count the split inversions between sorted sub problems
# Get back the count of split inversions and the the merge-sorted subproblems (sorted X)
split_count, a_b = merge_count_split_inversions(sorted_a,sorted_b)

# return the count of inversions in X and merge-sorted X
count = split_count + count_a + count_b
return count, a_b


def merge_count_split_inversions(X, Y):
n = len(X) + len(Y)
j = 0
k = 0
X_Y =
count_split = 0

# Iterate through X and Y comparing value by value
for i in range(n):
# If we have reached the end of X or Y, there are no more inversions so append the remaining values to X_Y
if j == len(X):
X_Y.extend(Y[k:])
elif k == len(Y):
X_Y.extend(X[j:])
# Add the smaller of the two compared elements to X_Y
# If adding from Y, increment count of split inversions by the number of remaining elements in X
elif X[j] < Y[k]:
X_Y.append(X[j])
j += 1
else:
X_Y.append(Y[k])
k += 1
count_split += len(X) - j

# Return the total count of split inversions and the result of merge_sorting X and Y (X_Y)
return count_split, X_Y

sort_count_inversions(inversions)






share|improve this question

















  • 1




    Happy to take a look, but have you profiled your code yet?
    – scnerd
    May 15 at 13:43










  • Could you provide a sample of inversions?
    – scnerd
    May 15 at 13:46






  • 1




    You say the algorithm works, but the sorted result that gets returned is much larger than the original input array. Whatever is causing that might also be the source of the slowness.
    – scnerd
    May 15 at 13:52












up vote
1
down vote

favorite









up vote
1
down vote

favorite











I am trying to implement a recursive count of inversions in a list of integers by using merge sort and counting the number of split inversions on each recursion. My code works but runs painfully slowly above a list of about 100 integers, and I need to count for ~3 orders of magnitude greater. Can anyone see what I'm doing wrong that makes it so slow?



inversions = np.loadtxt("inversion_text.txt", int).tolist()

def sort_count_inversions(X):
n = len(X)

# For subproblems of size 1 there are no inversions and no sorting required
if n == 1:
return 0, X
a = X[:n//2]
b = X[n//2:]

# Count the inversions in subproblems
# Get back the count of inversions and the sorted subproblems
count_a, sorted_a = sort_count_inversions(a)
count_b, sorted_b = sort_count_inversions(b)

# Count the split inversions between sorted sub problems
# Get back the count of split inversions and the the merge-sorted subproblems (sorted X)
split_count, a_b = merge_count_split_inversions(sorted_a,sorted_b)

# return the count of inversions in X and merge-sorted X
count = split_count + count_a + count_b
return count, a_b


def merge_count_split_inversions(X, Y):
n = len(X) + len(Y)
j = 0
k = 0
X_Y =
count_split = 0

# Iterate through X and Y comparing value by value
for i in range(n):
# If we have reached the end of X or Y, there are no more inversions so append the remaining values to X_Y
if j == len(X):
X_Y.extend(Y[k:])
elif k == len(Y):
X_Y.extend(X[j:])
# Add the smaller of the two compared elements to X_Y
# If adding from Y, increment count of split inversions by the number of remaining elements in X
elif X[j] < Y[k]:
X_Y.append(X[j])
j += 1
else:
X_Y.append(Y[k])
k += 1
count_split += len(X) - j

# Return the total count of split inversions and the result of merge_sorting X and Y (X_Y)
return count_split, X_Y

sort_count_inversions(inversions)






share|improve this question













I am trying to implement a recursive count of inversions in a list of integers by using merge sort and counting the number of split inversions on each recursion. My code works but runs painfully slowly above a list of about 100 integers, and I need to count for ~3 orders of magnitude greater. Can anyone see what I'm doing wrong that makes it so slow?



inversions = np.loadtxt("inversion_text.txt", int).tolist()

def sort_count_inversions(X):
n = len(X)

# For subproblems of size 1 there are no inversions and no sorting required
if n == 1:
return 0, X
a = X[:n//2]
b = X[n//2:]

# Count the inversions in subproblems
# Get back the count of inversions and the sorted subproblems
count_a, sorted_a = sort_count_inversions(a)
count_b, sorted_b = sort_count_inversions(b)

# Count the split inversions between sorted sub problems
# Get back the count of split inversions and the the merge-sorted subproblems (sorted X)
split_count, a_b = merge_count_split_inversions(sorted_a,sorted_b)

# return the count of inversions in X and merge-sorted X
count = split_count + count_a + count_b
return count, a_b


def merge_count_split_inversions(X, Y):
n = len(X) + len(Y)
j = 0
k = 0
X_Y =
count_split = 0

# Iterate through X and Y comparing value by value
for i in range(n):
# If we have reached the end of X or Y, there are no more inversions so append the remaining values to X_Y
if j == len(X):
X_Y.extend(Y[k:])
elif k == len(Y):
X_Y.extend(X[j:])
# Add the smaller of the two compared elements to X_Y
# If adding from Y, increment count of split inversions by the number of remaining elements in X
elif X[j] < Y[k]:
X_Y.append(X[j])
j += 1
else:
X_Y.append(Y[k])
k += 1
count_split += len(X) - j

# Return the total count of split inversions and the result of merge_sorting X and Y (X_Y)
return count_split, X_Y

sort_count_inversions(inversions)








share|improve this question












share|improve this question




share|improve this question








edited May 16 at 17:45









200_success

123k14143399




123k14143399









asked May 15 at 13:35









Felix

83




83







  • 1




    Happy to take a look, but have you profiled your code yet?
    – scnerd
    May 15 at 13:43










  • Could you provide a sample of inversions?
    – scnerd
    May 15 at 13:46






  • 1




    You say the algorithm works, but the sorted result that gets returned is much larger than the original input array. Whatever is causing that might also be the source of the slowness.
    – scnerd
    May 15 at 13:52












  • 1




    Happy to take a look, but have you profiled your code yet?
    – scnerd
    May 15 at 13:43










  • Could you provide a sample of inversions?
    – scnerd
    May 15 at 13:46






  • 1




    You say the algorithm works, but the sorted result that gets returned is much larger than the original input array. Whatever is causing that might also be the source of the slowness.
    – scnerd
    May 15 at 13:52







1




1




Happy to take a look, but have you profiled your code yet?
– scnerd
May 15 at 13:43




Happy to take a look, but have you profiled your code yet?
– scnerd
May 15 at 13:43












Could you provide a sample of inversions?
– scnerd
May 15 at 13:46




Could you provide a sample of inversions?
– scnerd
May 15 at 13:46




1




1




You say the algorithm works, but the sorted result that gets returned is much larger than the original input array. Whatever is causing that might also be the source of the slowness.
– scnerd
May 15 at 13:52




You say the algorithm works, but the sorted result that gets returned is much larger than the original input array. Whatever is causing that might also be the source of the slowness.
– scnerd
May 15 at 13:52










1 Answer
1






active

oldest

votes

















up vote
2
down vote



accepted










There's a bug in merge_count_split_inversions.



 for i in range(n):
if j == len(X):
add len(Y)-k elements to X_Y
elif k == len(Y):
add len(X)-j elements to X_Y
elif X[j] < Y[k]:
add 1 element to X_Y
j += 1
else:
add 1 element to X_Y
k += 1


This should have the invariant that after the ith time round the loop it has added i elements to X_Y. Either the first two cases need to add a single element, or they need to break.






share|improve this answer





















  • Great thanks for the spot, all runs fine and quickly now :-)
    – Felix
    May 16 at 17:28










  • @Felix I have rolled back Rev 4 → 2. Please see What to do when someone answers.
    – 200_success
    May 16 at 17:46










Your Answer




StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);








 

draft saved


draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f194456%2frecursive-merge-sort-inversion-count-algorithm-very-slow%23new-answer', 'question_page');

);

Post as a guest






























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
2
down vote



accepted










There's a bug in merge_count_split_inversions.



 for i in range(n):
if j == len(X):
add len(Y)-k elements to X_Y
elif k == len(Y):
add len(X)-j elements to X_Y
elif X[j] < Y[k]:
add 1 element to X_Y
j += 1
else:
add 1 element to X_Y
k += 1


This should have the invariant that after the ith time round the loop it has added i elements to X_Y. Either the first two cases need to add a single element, or they need to break.






share|improve this answer





















  • Great thanks for the spot, all runs fine and quickly now :-)
    – Felix
    May 16 at 17:28










  • @Felix I have rolled back Rev 4 → 2. Please see What to do when someone answers.
    – 200_success
    May 16 at 17:46














up vote
2
down vote



accepted










There's a bug in merge_count_split_inversions.



 for i in range(n):
if j == len(X):
add len(Y)-k elements to X_Y
elif k == len(Y):
add len(X)-j elements to X_Y
elif X[j] < Y[k]:
add 1 element to X_Y
j += 1
else:
add 1 element to X_Y
k += 1


This should have the invariant that after the ith time round the loop it has added i elements to X_Y. Either the first two cases need to add a single element, or they need to break.






share|improve this answer





















  • Great thanks for the spot, all runs fine and quickly now :-)
    – Felix
    May 16 at 17:28










  • @Felix I have rolled back Rev 4 → 2. Please see What to do when someone answers.
    – 200_success
    May 16 at 17:46












up vote
2
down vote



accepted







up vote
2
down vote



accepted






There's a bug in merge_count_split_inversions.



 for i in range(n):
if j == len(X):
add len(Y)-k elements to X_Y
elif k == len(Y):
add len(X)-j elements to X_Y
elif X[j] < Y[k]:
add 1 element to X_Y
j += 1
else:
add 1 element to X_Y
k += 1


This should have the invariant that after the ith time round the loop it has added i elements to X_Y. Either the first two cases need to add a single element, or they need to break.






share|improve this answer













There's a bug in merge_count_split_inversions.



 for i in range(n):
if j == len(X):
add len(Y)-k elements to X_Y
elif k == len(Y):
add len(X)-j elements to X_Y
elif X[j] < Y[k]:
add 1 element to X_Y
j += 1
else:
add 1 element to X_Y
k += 1


This should have the invariant that after the ith time round the loop it has added i elements to X_Y. Either the first two cases need to add a single element, or they need to break.







share|improve this answer













share|improve this answer



share|improve this answer











answered May 15 at 14:25









Peter Taylor

14k2454




14k2454











  • Great thanks for the spot, all runs fine and quickly now :-)
    – Felix
    May 16 at 17:28










  • @Felix I have rolled back Rev 4 → 2. Please see What to do when someone answers.
    – 200_success
    May 16 at 17:46
















  • Great thanks for the spot, all runs fine and quickly now :-)
    – Felix
    May 16 at 17:28










  • @Felix I have rolled back Rev 4 → 2. Please see What to do when someone answers.
    – 200_success
    May 16 at 17:46















Great thanks for the spot, all runs fine and quickly now :-)
– Felix
May 16 at 17:28




Great thanks for the spot, all runs fine and quickly now :-)
– Felix
May 16 at 17:28












@Felix I have rolled back Rev 4 → 2. Please see What to do when someone answers.
– 200_success
May 16 at 17:46




@Felix I have rolled back Rev 4 → 2. Please see What to do when someone answers.
– 200_success
May 16 at 17:46












 

draft saved


draft discarded


























 


draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f194456%2frecursive-merge-sort-inversion-count-algorithm-very-slow%23new-answer', 'question_page');

);

Post as a guest













































































Popular posts from this blog

Python Lists

Aion

JavaScript Array Iteration Methods