Remove pixel patch in image which is stored as array

I have an array I which stores N images of size P (number of pixels). Every image is of size P = q*q.



Now I want to delete patches of size ps around a selected index IDX (set all values to zero).



My approach was to reshape every single image using reshape(q,q) and delete the pixels around IDX. I also have to check if the index is not outside the image.



Here is an example:



BEFORE: [image]



AFTER: [image]



My code is a real bottleneck and I would like to know if there is a way to improve the performance of my approach.



    import numpy as np
    import matplotlib.pyplot as plt
    import time

    def myplot(I):
        imgs = 5
        for i in range(imgs**2):
            plt.subplot(imgs,imgs,(i+1))
            plt.imshow(I[i].reshape(q,q), cmap="viridis", interpolation="none")
            plt.axis("off")
        plt.show()

    N = 10000
    q = 28
    P = q*q
    I = np.ones((N,P))
    myplot(I)

    ps = 5
    IDX = np.random.randint(0,P,(N,1))
    x0, y0 = np.unravel_index(IDX,(q,q))

    t0 = time.time()

    # HOW TO IMPROVE THIS PART ? #
    for i in range(N):
        img = I[i].reshape(q,q)
        for x in range(ps):
            for y in range(ps):
                if (x0[i]+x < q) and (y0[i]+y < q):
                    img[x0[i]+x,y0[i]+y] = 0.0
        I[i] = img.reshape(1,q*q)

    print(time.time()-t0)
    myplot(I)


I call this code (without the plotting) about one million times from another program. Each call takes about 1 second on my system, which makes the code impractical so far.



Any advice?







  • Hi! I have rolled back your last edit. Please don't change or add to the code in your question after you have received answers. See "What should I do when someone answers my question?" Thank you.
    – Phrancis
    Jun 7 at 21:55

edited Jun 7 at 21:55 by Phrancis

asked Jun 7 at 10:13 by Samuel

1 Answer
accepted










  1. On my computer it takes 1.745 seconds to run the code in the post.



  2. There's no need for the array of random indexes to be two-dimensional:



    IDX = np.random.randint(0,P,(N,1))


    In fact this is harmful for performance, because it means that x0[i] is an array of length 1 (not a scalar) and so img[x0[i]+x,y0[i]+y] requires "fancy indexing" which is slower than normal indexing.



    It would be simpler to make the array of indexes one-dimensional:



    IDX = np.random.randint(P, size=N)


    This reduces the runtime to about 0.459 seconds (26.3% of the original).




  3. There is no need to reassign I[i] at the end of the loop. When you call the reshape method on a NumPy array, what you get is a view onto the original array (not a copy) if possible. (And it is possible in this case.) So updating the view also updates the original.



    This reduces the runtime to about 0.449 seconds (25.8%).




  4. Instead of looping over range(N) and then looking up I[i] and x0[i] and y0[i], use zip to loop over all the arrays simultaneously:



    for img, xx, yy in zip(I, x0, y0):
        img = img.reshape(q,q)
        for x in range(ps):
            for y in range(ps):
                if xx + x < q and yy + y < q:
                    img[xx + x, yy + y] = 0.0


    This reduces the runtime to about 0.358 seconds (20.5%).




  5. Instead of looping over all the pixels in the patch and updating each pixel individually, use slices to update the whole region in one step:



    for image, x, y in zip(I, x0, y0):
        image.reshape(q, q)[x:x + ps, y:y + ps] = 0.0


    This works because NumPy (and Python generally) ensures that the bounds of a slice do not go beyond the end of the array. See the slicing documentation:




    The slice of $s$ from $i$ to $j$ is defined as the sequence of items with index $k$ such that $i \le k < j$. If $i$ or $j$ is greater than len(s), use len(s).




    This reduces the runtime to about 0.025 seconds (1.4%).




  6. We can vectorize the additions x + ps and y + ps:



    for image, x, y, x1, y1 in zip(I, x0, y0, x0 + ps, y0 + ps):
        image.reshape(q, q)[x:x1, y:y1] = 0.0


    This reduces the runtime to about 0.021 seconds (1.2%).




  7. We could avoid the reshape inside the loop by doing a single reshape of the whole I array:



    images = I.reshape(N, q, q)


    and then:



    for image, x, y, x1, y1 in zip(images, x0, y0, x0 + ps, y0 + ps):
        image[x:x1, y:y1] = 0.0


    This reduces the runtime to about 0.018 seconds (1.0%).




  8. We can halve the number of indexing operations by indexing the images array just once on each loop iteration:



    for i, x, y, x1, y1 in zip(range(N), x0, y0, x0 + ps, y0 + ps):
        images[i, x:x1, y:y1] = 0.0


    This reduces the runtime to about 0.011 seconds (0.6%).



That's a speedup of about 150 times overall, but calling this a million times will still take about 3 hours on my computer. There may be other improvements to be had if only we could see more of your code, but you'll need to make a new post for that.
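
Putting steps 2 to 8 together, a minimal sketch might look like the following. The function names `zero_patches` and `zero_patches_reference`, the seeded generator, and the small `N` are illustrative assumptions rather than part of the post. The second function is only the per-pixel loop from the question, kept to check that both versions produce the same array.

```python
import numpy as np

def zero_patches(I, idx, q, ps):
    """Zero a ps-by-ps patch in each flattened q*q image, in place.

    I   : float array of shape (N, q*q), assumed C-contiguous
    idx : 1-D integer array of N flat pixel indexes (top-left patch corner)
    """
    N = len(I)
    images = I.reshape(N, q, q)      # a view for a contiguous I, so writes reach I
    x0, y0 = np.unravel_index(idx, (q, q))
    for i, x, y, x1, y1 in zip(range(N), x0, y0, x0 + ps, y0 + ps):
        images[i, x:x1, y:y1] = 0.0  # stops beyond q are clamped by slicing

def zero_patches_reference(I, idx, q, ps):
    """Per-pixel loop as in the question, used only as a correctness check."""
    x0, y0 = np.unravel_index(idx, (q, q))
    for i in range(len(I)):
        img = I[i].reshape(q, q)
        for x in range(ps):
            for y in range(ps):
                if x0[i] + x < q and y0[i] + y < q:
                    img[x0[i] + x, y0[i] + y] = 0.0

rng = np.random.default_rng(0)       # seeded only to make the check repeatable
N, q, ps = 200, 28, 5
idx = rng.integers(q * q, size=N)    # one-dimensional, as suggested in step 2

fast = np.ones((N, q * q))
slow = np.ones((N, q * q))
zero_patches(fast, idx, q, ps)
zero_patches_reference(slow, idx, q, ps)
same = np.array_equal(fast, slow)    # both versions should zero the same pixels
```

Because `I.reshape(N, q, q)` is a view here, the function modifies `I` directly and returns nothing. If `I` were not contiguous, `reshape` could return a copy and the writes would be lost, so this sketch assumes an array created as in the question.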






  • Awesome! Now I try to understand everything.
    – Samuel
    Jun 7 at 13:57










  • I am still amazed by your answer. Did not think that it would go so fast using Python. There is one thing I noted using your approach. Given an odd patch size, for example ps=3 and coordinates x0 and y0, your approach deletes 9 pixels from top left to bottom right (that is the square from [x0,y0] to [x0+ps,y0+ps]). But how can I delete the surrounding pixels instead, that is [x0-1, y0-1] to [x0+1, y0+1]? I tried just doing x0=x0-1 and y0=y0-1, which results quite often in images where no pixels got deleted. What would be the right approach to do that?
    – Samuel
    Jun 7 at 21:44










  • Use np.maximum(x0 - 1, 0) instead of x0 - 1.
    – Gareth Rees
    Jun 7 at 22:04











  • I wanted to replace the patches with random numbers instead of zeros. So I tried images[i, x:x1, y:y1] = np.random.rand(x1-x,y1-y) which does not work. Any suggestions?
    – Samuel
    Jun 13 at 6:57
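
The two follow-up questions in the comments above could be handled along these lines. This is a sketch under assumptions, not code from the thread: the function name `fill_centered_patches` and its `rng` parameter are hypothetical, `x0` and `y0` are assumed to be one-dimensional, and `ps` is assumed odd. A likely cause of the random-fill failure (the comment does not give the error) is that `np.random.rand(x1-x, y1-y)` always has the full patch shape, while the slice is smaller near the border, so the assignment cannot broadcast. Taking the shape from the slice itself avoids that.

```python
import numpy as np

def fill_centered_patches(images, x0, y0, ps, rng=None):
    """Overwrite a ps-by-ps patch centred on (x0[i], y0[i]) in images[i], in place.

    images : array of shape (N, q, q); x0, y0 : 1-D integer arrays; ps : odd int.
    With rng=None the patch is zeroed, otherwise it gets uniform random values.
    """
    half = ps // 2
    # Clamp starts at 0: a negative start counts from the end of the axis,
    # so images[i, -1:2] would select nothing instead of rows 0 and 1.
    xa = np.maximum(x0 - half, 0)
    ya = np.maximum(y0 - half, 0)
    # Stops may run past q; slicing clamps them to the axis length.
    xb = x0 + half + 1
    yb = y0 + half + 1
    for i, x, y, x1, y1 in zip(range(len(images)), xa, ya, xb, yb):
        patch = images[i, x:x1, y:y1]   # a view, smaller than ps x ps at borders
        if rng is None:
            patch[...] = 0.0
        else:
            patch[...] = rng.random(patch.shape)  # match the slice's actual shape

rng = np.random.default_rng(0)          # seeded only to make the run repeatable
N, q, ps = 1000, 28, 3
x0, y0 = np.unravel_index(rng.integers(q * q, size=N), (q, q))

zeroed = np.ones((N, q, q))
fill_centered_patches(zeroed, x0, y0, ps)

noisy = np.ones((N, q, q))
fill_centered_patches(noisy, x0, y0, ps, rng=rng)
```

The unclamped `x0 - 1` explains the images where nothing was deleted: for a patch starting in row or column 0 the slice start becomes -1, which NumPy reads as the last index, and the resulting slice is empty.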










edited Jun 7 at 19:09

answered Jun 7 at 11:06 by Gareth Rees
