Recursive merge-sort inversion count algorithm very slow

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;

up vote
1
down vote

favorite

I am trying to implement a recursive count of inversions in a list of integers by using merge sort and counting the number of split inversions on each recursion. My code works but runs painfully slowly above a list of about 100 integers, and I need to count for ~3 orders of magnitude greater. Can anyone see what I'm doing wrong that makes it so slow?

inversions = np.loadtxt("inversion_text.txt", int).tolist()

def sort_count_inversions(X):
 n = len(X)

 # For subproblems of size 1 there are no inversions and no sorting required
 if n == 1:
 return 0, X
 a = X[:n//2]
 b = X[n//2:]

 # Count the inversions in subproblems
 # Get back the count of inversions and the sorted subproblems
 count_a, sorted_a = sort_count_inversions(a)
 count_b, sorted_b = sort_count_inversions(b)

 # Count the split inversions between sorted sub problems
 # Get back the count of split inversions and the the merge-sorted subproblems (sorted X)
 split_count, a_b = merge_count_split_inversions(sorted_a,sorted_b)

 # return the count of inversions in X and merge-sorted X
 count = split_count + count_a + count_b
 return count, a_b


def merge_count_split_inversions(X, Y):
 n = len(X) + len(Y)
 j = 0
 k = 0
 X_Y = 
 count_split = 0

 # Iterate through X and Y comparing value by value
 for i in range(n):
 # If we have reached the end of X or Y, there are no more inversions so append the remaining values to X_Y
 if j == len(X):
 X_Y.extend(Y[k:])
 elif k == len(Y):
 X_Y.extend(X[j:])
 # Add the smaller of the two compared elements to X_Y
 # If adding from Y, increment count of split inversions by the number of remaining elements in X
 elif X[j] < Y[k]: 
 X_Y.append(X[j])
 j += 1
 else:
 X_Y.append(Y[k])
 k += 1
 count_split += len(X) - j

 # Return the total count of split inversions and the result of merge_sorting X and Y (X_Y) 
 return count_split, X_Y 

sort_count_inversions(inversions)

edited May 16 at 17:45

200_success

123k14143399

asked May 15 at 13:35

Felix

1

Happy to take a look, but have you profiled your code yet?
â€“Â scnerd
May 15 at 13:43

Could you provide a sample of inversions?
â€“Â scnerd
May 15 at 13:46

1

You say the algorithm works, but the sorted result that gets returned is much larger than the original input array. Whatever is causing that might also be the source of the slowness.
â€“Â scnerd
May 15 at 13:52

add a commentÂ |Â

up vote
1
down vote

favorite

inversions = np.loadtxt("inversion_text.txt", int).tolist()

def sort_count_inversions(X):
 n = len(X)

 # For subproblems of size 1 there are no inversions and no sorting required
 if n == 1:
 return 0, X
 a = X[:n//2]
 b = X[n//2:]

 # Count the inversions in subproblems
 # Get back the count of inversions and the sorted subproblems
 count_a, sorted_a = sort_count_inversions(a)
 count_b, sorted_b = sort_count_inversions(b)

 # Count the split inversions between sorted sub problems
 # Get back the count of split inversions and the the merge-sorted subproblems (sorted X)
 split_count, a_b = merge_count_split_inversions(sorted_a,sorted_b)

 # return the count of inversions in X and merge-sorted X
 count = split_count + count_a + count_b
 return count, a_b


def merge_count_split_inversions(X, Y):
 n = len(X) + len(Y)
 j = 0
 k = 0
 X_Y = 
 count_split = 0

 # Iterate through X and Y comparing value by value
 for i in range(n):
 # If we have reached the end of X or Y, there are no more inversions so append the remaining values to X_Y
 if j == len(X):
 X_Y.extend(Y[k:])
 elif k == len(Y):
 X_Y.extend(X[j:])
 # Add the smaller of the two compared elements to X_Y
 # If adding from Y, increment count of split inversions by the number of remaining elements in X
 elif X[j] < Y[k]: 
 X_Y.append(X[j])
 j += 1
 else:
 X_Y.append(Y[k])
 k += 1
 count_split += len(X) - j

 # Return the total count of split inversions and the result of merge_sorting X and Y (X_Y) 
 return count_split, X_Y 

sort_count_inversions(inversions)

edited May 16 at 17:45

200_success

123k14143399

asked May 15 at 13:35

Felix

1

Happy to take a look, but have you profiled your code yet?
â€“Â scnerd
May 15 at 13:43

Could you provide a sample of inversions?
â€“Â scnerd
May 15 at 13:46

1

You say the algorithm works, but the sorted result that gets returned is much larger than the original input array. Whatever is causing that might also be the source of the slowness.
â€“Â scnerd
May 15 at 13:52

add a commentÂ |Â

up vote
1
down vote

favorite

inversions = np.loadtxt("inversion_text.txt", int).tolist()

def sort_count_inversions(X):
 n = len(X)

 # For subproblems of size 1 there are no inversions and no sorting required
 if n == 1:
 return 0, X
 a = X[:n//2]
 b = X[n//2:]

 # Count the inversions in subproblems
 # Get back the count of inversions and the sorted subproblems
 count_a, sorted_a = sort_count_inversions(a)
 count_b, sorted_b = sort_count_inversions(b)

 # Count the split inversions between sorted sub problems
 # Get back the count of split inversions and the the merge-sorted subproblems (sorted X)
 split_count, a_b = merge_count_split_inversions(sorted_a,sorted_b)

 # return the count of inversions in X and merge-sorted X
 count = split_count + count_a + count_b
 return count, a_b


def merge_count_split_inversions(X, Y):
 n = len(X) + len(Y)
 j = 0
 k = 0
 X_Y = 
 count_split = 0

 # Iterate through X and Y comparing value by value
 for i in range(n):
 # If we have reached the end of X or Y, there are no more inversions so append the remaining values to X_Y
 if j == len(X):
 X_Y.extend(Y[k:])
 elif k == len(Y):
 X_Y.extend(X[j:])
 # Add the smaller of the two compared elements to X_Y
 # If adding from Y, increment count of split inversions by the number of remaining elements in X
 elif X[j] < Y[k]: 
 X_Y.append(X[j])
 j += 1
 else:
 X_Y.append(Y[k])
 k += 1
 count_split += len(X) - j

 # Return the total count of split inversions and the result of merge_sorting X and Y (X_Y) 
 return count_split, X_Y 

sort_count_inversions(inversions)

edited May 16 at 17:45

200_success

123k14143399

asked May 15 at 13:35

Felix

inversions = np.loadtxt("inversion_text.txt", int).tolist()

def sort_count_inversions(X):
 n = len(X)

 # For subproblems of size 1 there are no inversions and no sorting required
 if n == 1:
 return 0, X
 a = X[:n//2]
 b = X[n//2:]

 # Count the inversions in subproblems
 # Get back the count of inversions and the sorted subproblems
 count_a, sorted_a = sort_count_inversions(a)
 count_b, sorted_b = sort_count_inversions(b)

 # Count the split inversions between sorted sub problems
 # Get back the count of split inversions and the the merge-sorted subproblems (sorted X)
 split_count, a_b = merge_count_split_inversions(sorted_a,sorted_b)

 # return the count of inversions in X and merge-sorted X
 count = split_count + count_a + count_b
 return count, a_b


def merge_count_split_inversions(X, Y):
 n = len(X) + len(Y)
 j = 0
 k = 0
 X_Y = 
 count_split = 0

 # Iterate through X and Y comparing value by value
 for i in range(n):
 # If we have reached the end of X or Y, there are no more inversions so append the remaining values to X_Y
 if j == len(X):
 X_Y.extend(Y[k:])
 elif k == len(Y):
 X_Y.extend(X[j:])
 # Add the smaller of the two compared elements to X_Y
 # If adding from Y, increment count of split inversions by the number of remaining elements in X
 elif X[j] < Y[k]: 
 X_Y.append(X[j])
 j += 1
 else:
 X_Y.append(Y[k])
 k += 1
 count_split += len(X) - j

 # Return the total count of split inversions and the result of merge_sorting X and Y (X_Y) 
 return count_split, X_Y 

sort_count_inversions(inversions)

edited May 16 at 17:45

200_success

123k14143399

asked May 15 at 13:35

Felix

edited May 16 at 17:45

200_success

123k14143399

edited May 16 at 17:45

200_success

123k14143399

edited May 16 at 17:45

200_success

123k14143399

asked May 15 at 13:35

Felix

asked May 15 at 13:35

Felix

asked May 15 at 13:35

Felix

1

Happy to take a look, but have you profiled your code yet?
â€“Â scnerd
May 15 at 13:43

Could you provide a sample of inversions?
â€“Â scnerd
May 15 at 13:46

1

You say the algorithm works, but the sorted result that gets returned is much larger than the original input array. Whatever is causing that might also be the source of the slowness.
â€“Â scnerd
May 15 at 13:52

add a commentÂ |Â

1

Happy to take a look, but have you profiled your code yet?
â€“Â scnerd
May 15 at 13:43

Could you provide a sample of inversions?
â€“Â scnerd
May 15 at 13:46

1

You say the algorithm works, but the sorted result that gets returned is much larger than the original input array. Whatever is causing that might also be the source of the slowness.
â€“Â scnerd
May 15 at 13:52

Happy to take a look, but have you profiled your code yet?
â€“Â scnerd
May 15 at 13:43

Could you provide a sample of inversions?
â€“Â scnerd
May 15 at 13:46

You say the algorithm works, but the sorted result that gets returned is much larger than the original input array. Whatever is causing that might also be the source of the slowness.
â€“Â scnerd
May 15 at 13:52

add a commentÂ |Â

1 Answer
1

active

oldest

votes

up vote
2
down vote

accepted

There's a bug in merge_count_split_inversions.

 for i in range(n):
 if j == len(X):
 add len(Y)-k elements to X_Y
 elif k == len(Y):
 add len(X)-j elements to X_Y
 elif X[j] < Y[k]:
 add 1 element to X_Y
 j += 1
 else:
 add 1 element to X_Y
 k += 1

This should have the invariant that after the ith time round the loop it has added i elements to X_Y. Either the first two cases need to add a single element, or they need to break.

answered May 15 at 14:25

Peter Taylor

14k2454

Great thanks for the spot, all runs fine and quickly now :-)
â€“Â Felix
May 16 at 17:28

@Felix I have rolled back Rev 4 Ã¢Â†Â’ 2. Please see What to do when someone answers.
â€“Â 200_success
May 16 at 17:46

add a commentÂ |Â

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f194456%2frecursive-merge-sort-inversion-count-algorithm-very-slow%23new-answer', 'question_page');

);

Post as a guest

Name

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
2
down vote

accepted

There's a bug in merge_count_split_inversions.

 for i in range(n):
 if j == len(X):
 add len(Y)-k elements to X_Y
 elif k == len(Y):
 add len(X)-j elements to X_Y
 elif X[j] < Y[k]:
 add 1 element to X_Y
 j += 1
 else:
 add 1 element to X_Y
 k += 1

This should have the invariant that after the ith time round the loop it has added i elements to X_Y. Either the first two cases need to add a single element, or they need to break.

answered May 15 at 14:25

Peter Taylor

14k2454

Great thanks for the spot, all runs fine and quickly now :-)
â€“Â Felix
May 16 at 17:28

@Felix I have rolled back Rev 4 Ã¢Â†Â’ 2. Please see What to do when someone answers.
â€“Â 200_success
May 16 at 17:46

add a commentÂ |Â

up vote
2
down vote

accepted

There's a bug in merge_count_split_inversions.

 for i in range(n):
 if j == len(X):
 add len(Y)-k elements to X_Y
 elif k == len(Y):
 add len(X)-j elements to X_Y
 elif X[j] < Y[k]:
 add 1 element to X_Y
 j += 1
 else:
 add 1 element to X_Y
 k += 1

This should have the invariant that after the ith time round the loop it has added i elements to X_Y. Either the first two cases need to add a single element, or they need to break.

answered May 15 at 14:25

Peter Taylor

14k2454

Great thanks for the spot, all runs fine and quickly now :-)
â€“Â Felix
May 16 at 17:28

@Felix I have rolled back Rev 4 Ã¢Â†Â’ 2. Please see What to do when someone answers.
â€“Â 200_success
May 16 at 17:46

add a commentÂ |Â

up vote
2
down vote

accepted

There's a bug in merge_count_split_inversions.

 for i in range(n):
 if j == len(X):
 add len(Y)-k elements to X_Y
 elif k == len(Y):
 add len(X)-j elements to X_Y
 elif X[j] < Y[k]:
 add 1 element to X_Y
 j += 1
 else:
 add 1 element to X_Y
 k += 1

This should have the invariant that after the ith time round the loop it has added i elements to X_Y. Either the first two cases need to add a single element, or they need to break.

answered May 15 at 14:25

Peter Taylor

14k2454

There's a bug in merge_count_split_inversions.

 for i in range(n):
 if j == len(X):
 add len(Y)-k elements to X_Y
 elif k == len(Y):
 add len(X)-j elements to X_Y
 elif X[j] < Y[k]:
 add 1 element to X_Y
 j += 1
 else:
 add 1 element to X_Y
 k += 1

This should have the invariant that after the ith time round the loop it has added i elements to X_Y. Either the first two cases need to add a single element, or they need to break.

answered May 15 at 14:25

Peter Taylor

14k2454

answered May 15 at 14:25

Peter Taylor

14k2454

answered May 15 at 14:25

Peter Taylor

14k2454

answered May 15 at 14:25

Peter Taylor

14k2454

Great thanks for the spot, all runs fine and quickly now :-)
â€“Â Felix
May 16 at 17:28

@Felix I have rolled back Rev 4 Ã¢Â†Â’ 2. Please see What to do when someone answers.
â€“Â 200_success
May 16 at 17:46

add a commentÂ |Â

Great thanks for the spot, all runs fine and quickly now :-)
â€“Â Felix
May 16 at 17:28

@Felix I have rolled back Rev 4 Ã¢Â†Â’ 2. Please see What to do when someone answers.
â€“Â 200_success
May 16 at 17:46

Great thanks for the spot, all runs fine and quickly now :-)
â€“Â Felix
May 16 at 17:28

@Felix I have rolled back Rev 4 Ã¢Â†Â’ 2. Please see What to do when someone answers.
â€“Â 200_success
May 16 at 17:46

add a commentÂ |Â

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

搜尋此網誌

trjhtr