Calculate median of a matrix


I am solving interview questions from here.

Problem: Given an N × M matrix in which each row is sorted, find the overall median of the matrix. Assume N*M is odd.

Note: No extra memory is allowed.
For example:
Matrix = [[1, 3, 5],
          [2, 6, 9],
          [3, 6, 9]]
A = [1, 2, 3, 3, 5, 6, 6, 9, 9]
Median is 5. So, output is 5.

This is my approach:

def find_median( A):
    """Returns the median value from given list"""
    for i in range(1,len(A)):
        A[0].extend(A[i])

    return (sorted(A[0])).pop(len(A[0])/2)

Test Cases:

assert find_median([[1,3,5],[2,5,9],[3,6,11]]) == 5
assert find_median([[0,1,1],[2,6,10],[3,5,9]]) == 3
assert find_median([[1,3,4,12,14],[1,6,9,10,15],[0,1,3,3,4]]) == 4

I am able to solve the problem, but I would like to know whether there is a better approach.

asked May 24 at 17:32 by Latika Agarwal; edited May 25 at 8:30 by t3chb0t

Presumably by "no extra memory is allowed" they actually mean "$O(1)$ extra memory is allowed" (otherwise there would be no solution). But the code in the post uses $\Theta(NM)$ extra memory and so does not solve the problem.
- Gareth Rees
May 24 at 17:39

@Gareth Rees Updated the code to satisfy the constraint "no extra memory allocation required".
- Latika Agarwal
May 24 at 18:56

It's not there yet: the calls to A[0].extend(A[i]) have to allocate $\Theta(MN)$ extra memory in order to extend the list. If you're having trouble telling how much extra memory you are using, it might help to use the __sizeof__ method to determine how much memory a particular object is using. For example, A[0].__sizeof__() tells you the memory used by the list A[0] in bytes.
- Gareth Rees
May 25 at 9:07
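The __sizeof__ check suggested in this comment can be tried directly. The snippet below is only a rough sketch: the variable names are made up for illustration, the byte counts depend on the Python build, and __sizeof__ reports the list object itself (its array of references), not the integers it points to.

```python
matrix = [[1, 3, 5], [2, 6, 9], [3, 6, 9]]

# Size of the first row before the question's loop runs.
size_before = matrix[0].__sizeof__()

# The loop from the question: append every other row to the first row.
for i in range(1, len(matrix)):
    matrix[0].extend(matrix[i])

# Size of the first row afterwards; it now holds all N * M references.
size_after = matrix[0].__sizeof__()

print(size_before, size_after)  # exact numbers vary between Python builds
```

The second number is larger than the first, which is the extra allocation the comment is pointing at.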

I'm pretty sure this can be solved without allocating any memory on the heap. You will need some allocation to keep track of some counters (namely $O(n)$). You know on which index the median will be (because you know the dimensions of the matrix). Since all rows are sorted, you can make use of a merge-sort-like iteration of the rows to get to that index. You'll need a "current" counter for each row, though.
- Vogel612♦
May 25 at 9:49

I have been educated in chat. The solution is not what you have here, but it's not pretty on the time complexity level :/
- Vogel612♦
May 25 at 10:28


3 Answers
accepted

Sorting the contents of the matrix and then picking the index with the median value is a good approach. Let's see if we can do it with constant extra memory.

for i in range(1,len(A)):
    A[0].extend(A[i])

This extends the first row of the matrix to contain every row in a flat list. Before, the matrix held N * M elements, whereas now it holds N * M (the first row) + (N - 1) * M (all the other rows). Subtracting the original size from this tells us how much extra memory we are using: (N - 1) * M additional elements, or in other words O(NM) extra memory. This is not what we want.

The reason to put all the elements in one list is to make sorting easy. Let's see if we can sort without needing a flattened (1D) list. Many sorting algorithms don't require extra memory; they are called "in-place" sorting algorithms. For simplicity we will modify selection sort to work for our case.

Selection sort picks the smallest element in the list and puts it at the front. Then it finds the next smallest element and puts it second, and so forth. To implement this, we find the smallest element in the whole list and swap it with the first element. Then we find the smallest element of the list while skipping the first slot, swap it into the second slot, and so on.

def index_of_smallest(numbers, starting_index):
    # Assume numbers is not empty.
    smallest, index = numbers[starting_index], starting_index
    # Walk the indexes instead of slicing: numbers[starting_index:] would
    # copy the tail of the list, which is extra memory we want to avoid.
    for i in range(starting_index + 1, len(numbers)):
        if numbers[i] < smallest:
            smallest, index = numbers[i], i

    return index


def selection_sort(numbers):
    size = len(numbers)
    for i in range(size):
        index = index_of_smallest(numbers, i)
        numbers[i], numbers[index] = numbers[index], numbers[i]

    # Don't return anything, we are modifying it inplace.

Now we need this process to work on a matrix instead of a flat list. This is straightforward enough: we can loop over the matrix (left to right, top to bottom) and ignore cells we have already dealt with. In the code below, x is the row coordinate and y is the column coordinate.

def coordinates_of_smallest(matrix, starting_x, starting_y):
    smallest, smallest_x, smallest_y = matrix[starting_x][starting_y], starting_x, starting_y
    for x, row in enumerate(matrix):
        for y, cell in enumerate(row):
            if x < starting_x or (x == starting_x and y < starting_y):
                continue
            if cell < smallest:
                smallest, smallest_x, smallest_y = cell, x, y

    return smallest_x, smallest_y


def selection_sort(matrix):
    # Assume the matrix is not empty.
    n, m = len(matrix), len(matrix[0])
    for x in range(n):
        for y in range(m):
            smallest_x, smallest_y = coordinates_of_smallest(matrix, x, y)
            matrix[x][y], matrix[smallest_x][smallest_y] = matrix[smallest_x][smallest_y], matrix[x][y]

>>> matrix = [[1, 3, 5], [2, 6, 9], [3, 6, 9]]
>>> selection_sort(matrix)
>>> print(matrix) # [[1, 2, 3], [3, 5, 6], [6, 9, 9]]

Now getting the median is a piece of cake: it will be in the middle slot of the middle row! Since N * M is odd, both N and M must be odd. Therefore the median is at matrix[N // 2][M // 2].

There is a little room for improvement here. While we only use constant extra memory, our time complexity has gone up from O(nm log(nm)) to O((nm)**2). For a better time complexity, I would recommend an in-place quicksort, which brings us back to O(nm log(nm)) on average.

Another point is that we are doing too much work. Once we have worked our way up to row N // 2 and slot M // 2, we are actually done! We have put the median element in its place, and we can stop. This is a simple enough check to add, and it halves the number of cells we have to place. Because the early passes scan the most cells, this saves roughly a quarter of the comparisons.
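The early stop described above could look something like the following. This is a sketch rather than the code from this answer: the name find_median_in_place is made up for illustration, and it walks a single row-major position with divmod instead of calling coordinates_of_smallest. Like the full sort, it reorders the matrix it is given.

```python
def find_median_in_place(matrix):
    """Return the median of a matrix with an odd number of cells.

    Runs selection sort over the cells in row-major order, but stops as
    soon as the middle cell holds its final value. Reorders the input.
    """
    n, m = len(matrix), len(matrix[0])
    target = (n * m) // 2  # row-major position of the median
    for position in range(target + 1):
        x, y = divmod(position, m)
        # Find the smallest cell at or after (x, y) in row-major order.
        best_x, best_y = x, y
        for later in range(position + 1, n * m):
            i, j = divmod(later, m)
            if matrix[i][j] < matrix[best_x][best_y]:
                best_x, best_y = i, j
        # Swap it into place; cells before `position` are already final.
        matrix[x][y], matrix[best_x][best_y] = matrix[best_x][best_y], matrix[x][y]
    return matrix[target // m][target % m]
```

For the example matrix from the question, find_median_in_place([[1, 3, 5], [2, 6, 9], [3, 6, 9]]) returns 5. The extra memory is a handful of integers, independent of N and M (in Python 3, where range does not build a list).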

answered May 25 at 14:36 by spyr03; edited May 25 at 14:53

This makes no use of the fact that the numbers in each row are in order, and the in-place sorting is unnecessary.
- Maarten Fabré
May 28 at 8:38

@MaartenFabré The in-place sorting is critical for it to use constant additional memory. That's the number one priority, not time complexity. Yes, it is not necessary to do a full sort, but I don't think the additional code complexity is worth it. Also, yes, finding the smallest element can be done quicker by taking into account that the rows are sorted, but again it is additional code complexity for something that isn't a priority.
- spyr03
May 29 at 13:00
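For comparison, here is one way to exploit the sorted rows while keeping the extra memory constant and leaving the matrix untouched. It is not taken from any of the posts on this page; it is a sketch of a different technique (binary search over the value range, counting with bisect), and it assumes the cells are integers. The name find_median_by_counting is made up for illustration.

```python
from bisect import bisect_right


def find_median_by_counting(matrix):
    """Median of a matrix of integers whose rows are each sorted.

    Binary-searches for the smallest value v such that at least
    (N * M) // 2 + 1 cells are <= v. Assumes integer cells.
    """
    n, m = len(matrix), len(matrix[0])
    needed = (n * m) // 2 + 1  # the median is the needed-th smallest cell
    low = min(row[0] for row in matrix)    # smallest cell overall
    high = max(row[-1] for row in matrix)  # largest cell overall
    while low < high:
        mid = (low + high) // 2
        # Count the cells <= mid with one binary search per sorted row.
        count = sum(bisect_right(row, mid) for row in matrix)
        if count >= needed:
            high = mid      # the median is mid or smaller
        else:
            low = mid + 1   # the median is larger than mid
    return low
```

Each pass costs one binary search per row, so the running time is about O(N log M) per pass, times the logarithm of the difference between the largest and smallest cell.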


Follow PEP8
- A is a bad variable name; use, say, matrix.
- You should remove the space in front of the function argument.
- You should put spaces after ,.
- You don't need the () surrounding sorted.
- You could add some space around your division.

You can use // rather than / to make your code compatible with both Python 2 and Python 3.

You don't need to use pop; normal indexing will work too.

def find_median(matrix):
    """Returns the median value from given matrix"""
    for i in range(1, len(matrix)):
        matrix[0].extend(matrix[i])
    return sorted(matrix[0])[len(matrix[0]) // 2]

Your code doesn't work as the challenge asks. If you add print(matrix) before the return, you'll see:

[[1, 3, 5, 2, 6, 9, 3, 6, 9], [2, 6, 9], [3, 6, 9]]

answered May 25 at 10:47 by Peilonrayz

I think you could be a little clearer as to why the original code doesn't work; mentioning "you've used extra space" would be clearer.
- spyr03
May 25 at 14:39


making one list

Name your variables correctly. When you look back at your code in a few months, you will have to look for a few minutes to figure out what you did. You'll have to figure out that A[0] is the list with all the values of the rows appended, and that len(A[0])/2 is the index of the median.

PS. This code will fail in Python 3, because / is true division there and pop does not accept a float. If you really need floor division, use //, which is clear in both Python 2 and 3.

instead of

for i in range(1,len(A)):
    A[0].extend(A[i])

at least you can do

all_elements = []
for row in A:
    all_elements.extend(row)

or even better, use itertools.chain.from_iterable

from itertools import chain

def find_median(A):
    all_elements = chain.from_iterable(A)
    median_index = len(A) * len(A[0]) // 2
    return sorted(all_elements)[median_index]

alternative approach

In your solution, you'll have 3 copies of the whole matrix (+ whatever sorted uses internally):

A[0] contains a copy of each element of the matrix because OP appends them all there.

The rest of the rows A also still exist, so they also contain an extra copy of each element (apart from the first row),

sorted generates a third list

By using chain you eliminate the first, so you still remain with 2 copies.

The easiest way to do this without copying the whole matrix into a sorted list is to use a sorted queue of iterators over the different rows, sorted by the next value, and to pop and reinsert on this queue until you reach the median. I use bisect.insort_left to insert in order.

from bisect import insort_left
from collections import deque

def find_median(matrix):
    """
    finds the median in a matrix with sorted rows
    """
    median_index = len(matrix) * len(matrix[0]) // 2
    iterators = map(iter, matrix)
    # Each entry is (next value, row number, iterator over the rest of the row).
    iterators = deque(sorted((next(it), row, it) for (row, it) in enumerate(iterators)))
    idx = 0
    while idx <= median_index:
        value, row, iterator = iterators.popleft()
        try:
            item = next(iterator), row, iterator
            insort_left(iterators, item)
        except StopIteration:
            pass
        idx += 1
        # print(value, idx)
    return value

The deque consumes some extra memory, but only $O(N)$ instead of $O(NM)$
This can also be done using a list of length N with the index of the iteration, doing the iteration over the different rows yourself.

The row is added to the item as tiebreaker when there are multiple rows with the same value because iterators are not sortable.
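The list-of-indexes variant mentioned above might look roughly like this. It is a sketch under the same assumptions as the deque version (non-empty matrix, sorted rows of equal length, odd number of cells); the names find_median_with_indexes and positions are made up for illustration. It keeps one integer per row, so the extra memory is $O(N)$, and it rescans all rows for every element it consumes.

```python
def find_median_with_indexes(matrix):
    """Median of a matrix with sorted rows, using one index per row."""
    median_index = len(matrix) * len(matrix[0]) // 2
    positions = [0] * len(matrix)  # next unread column in each row
    value = None
    for _ in range(median_index + 1):
        # Pick the row whose next unread element is the smallest.
        best_row = None
        for row, position in enumerate(positions):
            if position < len(matrix[row]) and (
                best_row is None
                or matrix[row][position] < matrix[best_row][positions[best_row]]
            ):
                best_row = row
        # Consume that element; after median_index + 1 steps it is the median.
        value = matrix[best_row][positions[best_row]]
        positions[best_row] += 1
    return value
```

Compared with the sorted deque, this trades the insertion step for a linear scan over the rows, which keeps the code free of iterator bookkeeping at the cost of more comparisons.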

standard library

I found out that heapq.merge does the same as what I do with the deque of iterators, so this works too:

from heapq import merge
from itertools import islice

def find_median_heapq(matrix):
    median_index = len(matrix) * len(matrix[0]) // 2
    all_items = merge(*matrix)
    return next(islice(all_items, median_index, None))

answered May 25 at 11:55 by Maarten Fabré; edited May 28 at 6:41

A[0] doesn't copy, what do you mean by it? "The original A, A[0] and"
- Peilonrayz
May 25 at 12:09

A[0] contains a copy of each element of the matrix because OP appends them all there. The rest of the rows of A also still exist, so they also contain an extra copy of each element (apart from the first row), and sorted generates a third list.
- Maarten Fabré
May 25 at 12:18

Please can you clarify that in your answer, as it currently reads as if indexing a list returns a copy.
- Peilonrayz
May 25 at 12:20

Shouldn't "1st" be used instead of "1nd"?
- Mathias Ettinger
May 25 at 13:30

add a commentÂ |Â

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f195104%2fcalculate-median-of-a-matrix%23new-answer', 'question_page');

);

Post as a guest

Name

3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

up vote
1
down vote

accepted

Sorting the contents of the matrix and then picking the index with the median value is a good approach. Lets see if we can do it with constant extra memory.

for i in range(1,len(A)):
 A[0].extend(A[i])

def index_of_smallest(numbers, starting_index):
 # Assume numbers is not empty.
 smallest, index = numbers[starting_index], starting_index
 for i, number in enumerate(numbers[starting_index:], starting_index):
 if number < smallest:
 smallest, index = number, i

 return index


def selection_sort(numbers):
 size = len(numbers)
 for i in range(size):
 index = index_of_smallest(numbers, i)
 numbers[i], numbers[index] = numbers[index], numbers[i]

 # Don't return anything, we are modifying it inplace.

def coordinates_of_smallest(matrix, starting_x, starting_y):
 smallest, smallest_x, smallest_y = matrix[starting_x][starting_y], starting_x, starting_y
 for x, row in enumerate(matrix):
 for y, cell in enumerate(row):
 if x < starting_x or (x == starting_x and y < starting_y):
 continue
 if cell < smallest:
 smallest, smallest_x, smallest_y = cell, x, y

 return smallest_x, smallest_y


def selection_sort(matrix):
 # Assume the matrix is not empty.
 n, m = len(matrix), len(matrix[0])
 for x in range(n):
 for y in range(m):
 smallest_x, smallest_y = coordinates_of_smallest(matrix, x, y)
 matrix[x][y], matrix[smallest_x][smallest_y] = matrix[smallest_x][smallest_y], matrix[x][y]

>>> matrix = [[1, 3, 5], [2, 6, 9], [3, 6, 9]]
>>> selection_sort(matrix)
>>> print(matrix) # [[1, 2, 3], [3, 5, 6], [6, 9, 9]]

Now getting the median of this is a piece of cake, it will be in the middle slot of the middle row! Since N * M is odd, both N and M must be odd. Therefore the median is at matrix[N // 2][M // 2].

edited May 25 at 14:53

answered May 25 at 14:36

spyr03

1,122418

this makes no use of the fact that the numbers in the row are in order, and the in-place sorting is unnecessary
â€“Â Maarten FabrÃ©
May 28 at 8:38

@MaartenFabrÃ© The inplace sorting is critical for it to use constant additional memory. That's the number one priority, not time complexity. Yes it it not necessary to do a full sort, but I don't think the additional code complexity is worth it. Also yes finding the smallest element can be done quicker by taking into account the rows are sorted, but again it is additional code complexity for something that isn't a priority.
â€“Â spyr03
May 29 at 13:00

add a commentÂ |Â

up vote
1
down vote

accepted

Sorting the contents of the matrix and then picking the index with the median value is a good approach. Lets see if we can do it with constant extra memory.

for i in range(1,len(A)):
 A[0].extend(A[i])

def index_of_smallest(numbers, starting_index):
 # Assume numbers is not empty.
 smallest, index = numbers[starting_index], starting_index
 for i, number in enumerate(numbers[starting_index:], starting_index):
 if number < smallest:
 smallest, index = number, i

 return index


def selection_sort(numbers):
 size = len(numbers)
 for i in range(size):
 index = index_of_smallest(numbers, i)
 numbers[i], numbers[index] = numbers[index], numbers[i]

 # Don't return anything, we are modifying it inplace.

def coordinates_of_smallest(matrix, starting_x, starting_y):
 smallest, smallest_x, smallest_y = matrix[starting_x][starting_y], starting_x, starting_y
 for x, row in enumerate(matrix):
 for y, cell in enumerate(row):
 if x < starting_x or (x == starting_x and y < starting_y):
 continue
 if cell < smallest:
 smallest, smallest_x, smallest_y = cell, x, y

 return smallest_x, smallest_y


def selection_sort(matrix):
 # Assume the matrix is not empty.
 n, m = len(matrix), len(matrix[0])
 for x in range(n):
 for y in range(m):
 smallest_x, smallest_y = coordinates_of_smallest(matrix, x, y)
 matrix[x][y], matrix[smallest_x][smallest_y] = matrix[smallest_x][smallest_y], matrix[x][y]

>>> matrix = [[1, 3, 5], [2, 6, 9], [3, 6, 9]]
>>> selection_sort(matrix)
>>> print(matrix) # [[1, 2, 3], [3, 5, 6], [6, 9, 9]]

Now getting the median of this is a piece of cake, it will be in the middle slot of the middle row! Since N * M is odd, both N and M must be odd. Therefore the median is at matrix[N // 2][M // 2].

edited May 25 at 14:53

answered May 25 at 14:36

spyr03

1,122418

this makes no use of the fact that the numbers in the row are in order, and the in-place sorting is unnecessary
â€“Â Maarten FabrÃ©
May 28 at 8:38

@MaartenFabrÃ© The inplace sorting is critical for it to use constant additional memory. That's the number one priority, not time complexity. Yes it it not necessary to do a full sort, but I don't think the additional code complexity is worth it. Also yes finding the smallest element can be done quicker by taking into account the rows are sorted, but again it is additional code complexity for something that isn't a priority.
â€“Â spyr03
May 29 at 13:00

add a commentÂ |Â

up vote
1
down vote

accepted

Sorting the contents of the matrix and then picking the index with the median value is a good approach. Lets see if we can do it with constant extra memory.

for i in range(1,len(A)):
 A[0].extend(A[i])

def index_of_smallest(numbers, starting_index):
 # Assume numbers is not empty.
 smallest, index = numbers[starting_index], starting_index
 for i, number in enumerate(numbers[starting_index:], starting_index):
 if number < smallest:
 smallest, index = number, i

 return index


def selection_sort(numbers):
 size = len(numbers)
 for i in range(size):
 index = index_of_smallest(numbers, i)
 numbers[i], numbers[index] = numbers[index], numbers[i]

 # Don't return anything, we are modifying it inplace.

def coordinates_of_smallest(matrix, starting_x, starting_y):
 smallest, smallest_x, smallest_y = matrix[starting_x][starting_y], starting_x, starting_y
 for x, row in enumerate(matrix):
 for y, cell in enumerate(row):
 if x < starting_x or (x == starting_x and y < starting_y):
 continue
 if cell < smallest:
 smallest, smallest_x, smallest_y = cell, x, y

 return smallest_x, smallest_y


def selection_sort(matrix):
 # Assume the matrix is not empty.
 n, m = len(matrix), len(matrix[0])
 for x in range(n):
 for y in range(m):
 smallest_x, smallest_y = coordinates_of_smallest(matrix, x, y)
 matrix[x][y], matrix[smallest_x][smallest_y] = matrix[smallest_x][smallest_y], matrix[x][y]

>>> matrix = [[1, 3, 5], [2, 6, 9], [3, 6, 9]]
>>> selection_sort(matrix)
>>> print(matrix) # [[1, 2, 3], [3, 5, 6], [6, 9, 9]]

Now getting the median of this is a piece of cake, it will be in the middle slot of the middle row! Since N * M is odd, both N and M must be odd. Therefore the median is at matrix[N // 2][M // 2].

edited May 25 at 14:53

answered May 25 at 14:36

spyr03

1,122418

Sorting the contents of the matrix and then picking the index with the median value is a good approach. Lets see if we can do it with constant extra memory.

for i in range(1,len(A)):
 A[0].extend(A[i])

def index_of_smallest(numbers, starting_index):
 # Assume numbers is not empty.
 smallest, index = numbers[starting_index], starting_index
 for i, number in enumerate(numbers[starting_index:], starting_index):
 if number < smallest:
 smallest, index = number, i

 return index


def selection_sort(numbers):
 size = len(numbers)
 for i in range(size):
 index = index_of_smallest(numbers, i)
 numbers[i], numbers[index] = numbers[index], numbers[i]

 # Don't return anything, we are modifying it inplace.

def coordinates_of_smallest(matrix, starting_x, starting_y):
 smallest, smallest_x, smallest_y = matrix[starting_x][starting_y], starting_x, starting_y
 for x, row in enumerate(matrix):
 for y, cell in enumerate(row):
 if x < starting_x or (x == starting_x and y < starting_y):
 continue
 if cell < smallest:
 smallest, smallest_x, smallest_y = cell, x, y

 return smallest_x, smallest_y


def selection_sort(matrix):
 # Assume the matrix is not empty.
 n, m = len(matrix), len(matrix[0])
 for x in range(n):
 for y in range(m):
 smallest_x, smallest_y = coordinates_of_smallest(matrix, x, y)
 matrix[x][y], matrix[smallest_x][smallest_y] = matrix[smallest_x][smallest_y], matrix[x][y]

>>> matrix = [[1, 3, 5], [2, 6, 9], [3, 6, 9]]
>>> selection_sort(matrix)
>>> print(matrix) # [[1, 2, 3], [3, 5, 6], [6, 9, 9]]

Now getting the median of this is a piece of cake, it will be in the middle slot of the middle row! Since N * M is odd, both N and M must be odd. Therefore the median is at matrix[N // 2][M // 2].

edited May 25 at 14:53

answered May 25 at 14:36

spyr03

1,122418

edited May 25 at 14:53

answered May 25 at 14:36

spyr03

1,122418

answered May 25 at 14:36

spyr03

1,122418

answered May 25 at 14:36

spyr03

1,122418

this makes no use of the fact that the numbers in the row are in order, and the in-place sorting is unnecessary
â€“Â Maarten FabrÃ©
May 28 at 8:38

@MaartenFabrÃ© The inplace sorting is critical for it to use constant additional memory. That's the number one priority, not time complexity. Yes it it not necessary to do a full sort, but I don't think the additional code complexity is worth it. Also yes finding the smallest element can be done quicker by taking into account the rows are sorted, but again it is additional code complexity for something that isn't a priority.
â€“Â spyr03
May 29 at 13:00

add a commentÂ |Â

this makes no use of the fact that the numbers in the row are in order, and the in-place sorting is unnecessary
â€“Â Maarten FabrÃ©
May 28 at 8:38

@MaartenFabrÃ© The inplace sorting is critical for it to use constant additional memory. That's the number one priority, not time complexity. Yes it it not necessary to do a full sort, but I don't think the additional code complexity is worth it. Also yes finding the smallest element can be done quicker by taking into account the rows are sorted, but again it is additional code complexity for something that isn't a priority.
â€“Â spyr03
May 29 at 13:00

this makes no use of the fact that the numbers in the row are in order, and the in-place sorting is unnecessary
â€“Â Maarten FabrÃ©
May 28 at 8:38

@MaartenFabrÃ© The inplace sorting is critical for it to use constant additional memory. That's the number one priority, not time complexity. Yes it it not necessary to do a full sort, but I don't think the additional code complexity is worth it. Also yes finding the smallest element can be done quicker by taking into account the rows are sorted, but again it is additional code complexity for something that isn't a priority.
â€“Â spyr03
May 29 at 13:00

add a commentÂ |Â

up vote
1
down vote

Follow PEP8
- A is a bad variable name, use say matrix.
- You should remove the space in-front of the function argument.
- You should put spaces after ,.
- You don't need the () surrounding sorted.
- You could add some space around your division.

You can use // rather than / to make your code Python 2 and Python 3 compatible.

You don't need to use pop, normal indexing will work too.

def find_median(matrix):
    """Returns the median value from given matrix"""
    for i in range(1, len(matrix)):
        matrix[0].extend(matrix[i])
    return sorted(matrix[0])[len(matrix[0]) // 2]

Your code doesn't do what the challenge asks: it extends the first row in place, which uses extra memory and modifies the input. If you add print(matrix) before the return, you'll see:

[[1, 3, 5, 2, 6, 9, 3, 6, 9], [2, 6, 9], [3, 6, 9]]
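
To make that concrete, here is a small sketch that runs the function above on the example matrix from the question and records the length of the first row before and after the call. The variable names first_row_len_before and first_row_len_after are mine, purely for illustration.

```python
def find_median(matrix):
    """Returns the median value from given matrix"""
    for i in range(1, len(matrix)):
        matrix[0].extend(matrix[i])
    return sorted(matrix[0])[len(matrix[0]) // 2]


matrix = [[1, 3, 5], [2, 6, 9], [3, 6, 9]]
first_row_len_before = len(matrix[0])
median = find_median(matrix)
first_row_len_after = len(matrix[0])

# The first row has grown from M elements to N*M elements,
# and the caller's matrix is no longer the one that was passed in.
print(median, first_row_len_before, first_row_len_after)  # 5 3 9
```

The answer is right, but the first row now holds every element of the matrix, which is the extra memory the challenge forbids.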

answered May 25 at 10:47

Peilonrayz

24.3k336102

I think you could be a little clearer as to why the original code doesn't work; mentioning "you've used extra space" would be clearer.
– spyr03
May 25 at 14:39


up vote
1
down vote

making one list

PS. This code will fail in Python 3: there / is true division, so len(A[0])/2 is a float and pop raises a TypeError. If you need floor division, use //, which behaves the same in both Python 2 and 3.

instead of

    for i in range(1,len(A)):
        A[0].extend(A[i])

at least you can do

all_elements = []
for row in A:
    all_elements.extend(row)

or even better, use itertools.chain.from_iterable

from itertools import chain
all_elements = chain.from_iterable(A)
median_index = len(A) * len(A[0]) // 2
return sorted(all_elements)[median_index]
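
For reference, the snippet above can be wrapped into a complete function and checked against the test cases from the question. This is only a sketch: the name find_median_chain is mine, and it still builds one full sorted list, so it does not meet the "no extra memory" requirement.

```python
from itertools import chain


def find_median_chain(matrix):
    """Median of a matrix via one flattened, sorted list (O(N*M) extra memory)."""
    # chain.from_iterable yields the elements lazily; sorted() then
    # builds the single list that holds all N*M elements.
    all_elements = chain.from_iterable(matrix)
    median_index = len(matrix) * len(matrix[0]) // 2
    return sorted(all_elements)[median_index]


example = [[1, 3, 5], [2, 5, 9], [3, 6, 11]]
print(find_median_chain(example))  # 5
print(example)  # unchanged: [[1, 3, 5], [2, 5, 9], [3, 6, 11]]
```

Unlike the extend version, this leaves the input matrix untouched.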

alternative approach

In your solution, you'll have 3 copies of the whole matrix (plus whatever sorted uses internally):

- A[0] contains a copy of each element of the matrix, because you append them all there.
- The rest of the rows of A also still exist, so they contain an extra copy of each element (apart from the first row).
- sorted generates a third list.

By using chain you eliminate the first, so 2 copies remain.

from bisect import insort_left
from collections import deque

def find_median(matrix):
    """
    finds the median in a matrix with sorted rows
    """
    median_index = len(matrix) * len(matrix[0]) // 2
    iterators = map(iter, matrix)
    # one (next value, row index, iterator) tuple per row, ordered by value
    iterators = deque(sorted((next(it), row, it) for (row, it) in enumerate(iterators)))
    idx = 0
    while idx <= median_index:
        # take the smallest remaining value, then re-insert that row's next value
        value, row, iterator = iterators.popleft()
        try:
            item = next(iterator), row, iterator
            insort_left(iterators, item)
        except StopIteration:
            pass
        idx += 1
    return value

The row index is added to each item as a tiebreaker for the case where several rows have the same value, because iterators cannot be compared with each other.
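
A small demonstration of why the tiebreaker is needed, assuming Python 3 (where list iterators do not support <). The names without_row, with_row and comparable are only for this illustration.

```python
# Two items with the same value but no row index: tuple comparison
# falls through to the iterators, which Python 3 cannot order.
without_row = [(3, iter([4])), (3, iter([5]))]
try:
    without_row[0] < without_row[1]
    comparable = True
except TypeError:
    comparable = False

# With a unique row index in second position, the comparison is
# always decided before the iterator is reached.
with_row = sorted([(3, 1, iter([5])), (3, 0, iter([4]))])

print(comparable)                       # False
print([row for _, row, _ in with_row])  # [0, 1]
```

Because every row index is unique, the comparison never reaches the third element of the tuple.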

standard library

I found out that heapq.merge does the same as what I do with the deque of iterators, so this works too:

from heapq import merge
from itertools import islice

def find_median_heapq(matrix):
    median_index = len(matrix) * len(matrix[0]) // 2
    all_items = merge(*matrix)
    return next(islice(all_items, median_index, None))
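
Both merge-based versions keep one entry per row, so they need extra memory proportional to the number of rows rather than a constant amount. If constant extra memory is the hard requirement, one commonly used technique (not part of the code above, and assuming the matrix holds integers) is to binary search on the value itself and count, with bisect_right, how many elements in each row are less than or equal to the candidate. The name find_median_bisect is mine; treat this as a sketch rather than a reference solution.

```python
from bisect import bisect_right


def find_median_bisect(matrix):
    """Median of a matrix with sorted integer rows, using O(1) extra memory."""
    # The median is the smallest value v such that at least `needed`
    # elements of the matrix are <= v.
    needed = len(matrix) * len(matrix[0]) // 2 + 1
    lo = min(row[0] for row in matrix)
    hi = max(row[-1] for row in matrix)
    while lo < hi:
        mid = (lo + hi) // 2
        # Each row is sorted, so bisect_right gives the count of
        # elements <= mid in that row in O(log M).
        count = sum(bisect_right(row, mid) for row in matrix)
        if count < needed:
            lo = mid + 1
        else:
            hi = mid
    return lo


print(find_median_bisect([[1, 3, 5], [2, 5, 9], [3, 6, 11]]))  # 5
```

This only keeps a few integers around, at the cost of one pass over the rows per halving of the value range.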

edited May 28 at 6:41

answered May 25 at 11:55

Maarten Fabré

3,204214

A[0] doesn't copy, what do you mean by it? "The original A, A[0] and"
– Peilonrayz
May 25 at 12:09

A[0] contains a copy of each element of the matrix because OP appends them all there. The rest of the rows of A also still exist, so they also contain an extra copy of each element (apart from the first row), and sorted generates a third list.
– Maarten Fabré
May 25 at 12:18

Please can you clarify that in your answer, as it currently reads as if indexing a list returns a copy.
– Peilonrayz
May 25 at 12:20

Shouldn't "1st" be used instead of "1nd"?
– Mathias Ettinger
May 25 at 13:30

