Improving performance of a subroutine that checks for a vacancy in a lattice

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
1
down vote

favorite












I use the following subroutine to check whether a small subsection of a 3D Int array is all equal to zero. If any value in the particular subsection is non-zero, I exit and return .false.. Within CheckForVacancy, subroutine bc gets called which takes care of out of bounds array indices.



This subroutine gets called millions of times during a particular run and it is responsible for about half the running time of the entire program. In the distant past, I tried to optimize it by changing the order in which I access the Lattice array and by using the any built-in with array slices but I never saw much improvement. Furthermore, the "array slices" solution can get complicated when dealing with the boundaries of the array. I also tried fiddling with optimization flags but I was largely experimenting blindly at that point.



The code is as follows:



LOGICAL FUNCTION CheckForVacancy (x, y, z)
! Checks for vacancy for a site [x,y,z]

use atrpmodule

IMPLICIT NONE

INTEGER, INTENT(IN) :: x, y, z
INTEGER :: Sx, Sy, Sz, i
INTEGER, DIMENSION(1:26) :: SpaceX = (/1,1,1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,0,1,1,1,0,-1,-1,-1,0,0/)
INTEGER, DIMENSION(1:26) :: SpaceY = (/0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1/)
INTEGER, DIMENSION(1:26) :: SpaceZ = (/1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,1,1,0,1,0,-1,-1,-1,0,1,1,0/)
!---------------------------------------------------------------------------

checkforvacancy=.true.
do i=1,26
Sx = x + SpaceX(i)
Sy = y + SpaceY(i)
Sz = z + SpaceZ(i)
call bc(Sx,Sy,Sz)
if (lattice(Sx,Sy,Sz)/=0)then
CheckForVacancy=.false.
exit
endif
enddo

END FUNCTION CheckForVacancy

SUBROUTINE bc (x, y, z)
! Takes case of boundary conditions
USE atrpmodule
IMPLICIT NONE

INTEGER :: x, y, z
!---------------------------------------------------------------------------
IF (x < 1) then
x = x + LattXDimm
elseIF(x > LattXDimm) then
x = x - LattXDimm
endif

IF (y < 1) then
y = y + LattYDimm
elseIF (y > LattYDimm) then
y = y - LattYDimm
endif

IF (z < 1) then
z = z + LattZDimm
elseIF (z > LattZDimm) then
z = z - LattZDimm
endif

END SUBROUTINE bc


For reference, Lattice is defined like this:



INTEGER, DIMENSION(:,:,:), ALLOCATABLE :: Lattice


It is allocated depending on program input and then set equal to zero. Also for reference, I am using the GNU compiler.



I suspect that the answer to this questions may be that I cannot do better, but I wanted to see if any Fortran programmers can spot something that I am missing or may want to try.







share|improve this question



















  • I don´t know fortran but, what is the complexity of Lattice operation? How many points are you checking if they are different than 0?
    – juvian
    Mar 22 at 20:27










  • @juvian it is 26 of them.
    – sturgman
    Mar 22 at 21:09
















up vote
1
down vote

favorite












I use the following subroutine to check whether a small subsection of a 3D Int array is all equal to zero. If any value in the particular subsection is non-zero, I exit and return .false.. Within CheckForVacancy, subroutine bc gets called which takes care of out of bounds array indices.



This subroutine gets called millions of times during a particular run and it is responsible for about half the running time of the entire program. In the distant past, I tried to optimize it by changing the order in which I access the Lattice array and by using the any built-in with array slices but I never saw much improvement. Furthermore, the "array slices" solution can get complicated when dealing with the boundaries of the array. I also tried fiddling with optimization flags but I was largely experimenting blindly at that point.



The code is as follows:



LOGICAL FUNCTION CheckForVacancy (x, y, z)
! Checks for vacancy for a site [x,y,z]

use atrpmodule

IMPLICIT NONE

INTEGER, INTENT(IN) :: x, y, z
INTEGER :: Sx, Sy, Sz, i
INTEGER, DIMENSION(1:26) :: SpaceX = (/1,1,1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,0,1,1,1,0,-1,-1,-1,0,0/)
INTEGER, DIMENSION(1:26) :: SpaceY = (/0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1/)
INTEGER, DIMENSION(1:26) :: SpaceZ = (/1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,1,1,0,1,0,-1,-1,-1,0,1,1,0/)
!---------------------------------------------------------------------------

checkforvacancy=.true.
do i=1,26
Sx = x + SpaceX(i)
Sy = y + SpaceY(i)
Sz = z + SpaceZ(i)
call bc(Sx,Sy,Sz)
if (lattice(Sx,Sy,Sz)/=0)then
CheckForVacancy=.false.
exit
endif
enddo

END FUNCTION CheckForVacancy

SUBROUTINE bc (x, y, z)
! Takes case of boundary conditions
USE atrpmodule
IMPLICIT NONE

INTEGER :: x, y, z
!---------------------------------------------------------------------------
IF (x < 1) then
x = x + LattXDimm
elseIF(x > LattXDimm) then
x = x - LattXDimm
endif

IF (y < 1) then
y = y + LattYDimm
elseIF (y > LattYDimm) then
y = y - LattYDimm
endif

IF (z < 1) then
z = z + LattZDimm
elseIF (z > LattZDimm) then
z = z - LattZDimm
endif

END SUBROUTINE bc


For reference, Lattice is defined like this:



INTEGER, DIMENSION(:,:,:), ALLOCATABLE :: Lattice


It is allocated depending on program input and then set equal to zero. Also for reference, I am using the GNU compiler.



I suspect that the answer to this questions may be that I cannot do better, but I wanted to see if any Fortran programmers can spot something that I am missing or may want to try.







share|improve this question



















  • I don´t know fortran but, what is the complexity of Lattice operation? How many points are you checking if they are different than 0?
    – juvian
    Mar 22 at 20:27










  • @juvian it is 26 of them.
    – sturgman
    Mar 22 at 21:09












up vote
1
down vote

favorite









up vote
1
down vote

favorite











I use the following subroutine to check whether a small subsection of a 3D Int array is all equal to zero. If any value in the particular subsection is non-zero, I exit and return .false.. Within CheckForVacancy, subroutine bc gets called which takes care of out of bounds array indices.



This subroutine gets called millions of times during a particular run and it is responsible for about half the running time of the entire program. In the distant past, I tried to optimize it by changing the order in which I access the Lattice array and by using the any built-in with array slices but I never saw much improvement. Furthermore, the "array slices" solution can get complicated when dealing with the boundaries of the array. I also tried fiddling with optimization flags but I was largely experimenting blindly at that point.



The code is as follows:



LOGICAL FUNCTION CheckForVacancy (x, y, z)
! Checks for vacancy for a site [x,y,z]

use atrpmodule

IMPLICIT NONE

INTEGER, INTENT(IN) :: x, y, z
INTEGER :: Sx, Sy, Sz, i
INTEGER, DIMENSION(1:26) :: SpaceX = (/1,1,1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,0,1,1,1,0,-1,-1,-1,0,0/)
INTEGER, DIMENSION(1:26) :: SpaceY = (/0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1/)
INTEGER, DIMENSION(1:26) :: SpaceZ = (/1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,1,1,0,1,0,-1,-1,-1,0,1,1,0/)
!---------------------------------------------------------------------------

checkforvacancy=.true.
do i=1,26
Sx = x + SpaceX(i)
Sy = y + SpaceY(i)
Sz = z + SpaceZ(i)
call bc(Sx,Sy,Sz)
if (lattice(Sx,Sy,Sz)/=0)then
CheckForVacancy=.false.
exit
endif
enddo

END FUNCTION CheckForVacancy

SUBROUTINE bc (x, y, z)
! Takes case of boundary conditions
USE atrpmodule
IMPLICIT NONE

INTEGER :: x, y, z
!---------------------------------------------------------------------------
IF (x < 1) then
x = x + LattXDimm
elseIF(x > LattXDimm) then
x = x - LattXDimm
endif

IF (y < 1) then
y = y + LattYDimm
elseIF (y > LattYDimm) then
y = y - LattYDimm
endif

IF (z < 1) then
z = z + LattZDimm
elseIF (z > LattZDimm) then
z = z - LattZDimm
endif

END SUBROUTINE bc


For reference, Lattice is defined like this:



INTEGER, DIMENSION(:,:,:), ALLOCATABLE :: Lattice


It is allocated depending on program input and then set equal to zero. Also for reference, I am using the GNU compiler.



I suspect that the answer to this questions may be that I cannot do better, but I wanted to see if any Fortran programmers can spot something that I am missing or may want to try.







share|improve this question











I use the following subroutine to check whether a small subsection of a 3D Int array is all equal to zero. If any value in the particular subsection is non-zero, I exit and return .false.. Within CheckForVacancy, subroutine bc gets called which takes care of out of bounds array indices.



This subroutine gets called millions of times during a particular run and it is responsible for about half the running time of the entire program. In the distant past, I tried to optimize it by changing the order in which I access the Lattice array and by using the any built-in with array slices but I never saw much improvement. Furthermore, the "array slices" solution can get complicated when dealing with the boundaries of the array. I also tried fiddling with optimization flags but I was largely experimenting blindly at that point.



The code is as follows:



LOGICAL FUNCTION CheckForVacancy (x, y, z)
! Checks for vacancy for a site [x,y,z]

use atrpmodule

IMPLICIT NONE

INTEGER, INTENT(IN) :: x, y, z
INTEGER :: Sx, Sy, Sz, i
INTEGER, DIMENSION(1:26) :: SpaceX = (/1,1,1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,0,1,1,1,0,-1,-1,-1,0,0/)
INTEGER, DIMENSION(1:26) :: SpaceY = (/0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1/)
INTEGER, DIMENSION(1:26) :: SpaceZ = (/1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,1,1,0,1,0,-1,-1,-1,0,1,1,0/)
!---------------------------------------------------------------------------

checkforvacancy=.true.
do i=1,26
Sx = x + SpaceX(i)
Sy = y + SpaceY(i)
Sz = z + SpaceZ(i)
call bc(Sx,Sy,Sz)
if (lattice(Sx,Sy,Sz)/=0)then
CheckForVacancy=.false.
exit
endif
enddo

END FUNCTION CheckForVacancy

SUBROUTINE bc (x, y, z)
! Takes case of boundary conditions
USE atrpmodule
IMPLICIT NONE

INTEGER :: x, y, z
!---------------------------------------------------------------------------
IF (x < 1) then
x = x + LattXDimm
elseIF(x > LattXDimm) then
x = x - LattXDimm
endif

IF (y < 1) then
y = y + LattYDimm
elseIF (y > LattYDimm) then
y = y - LattYDimm
endif

IF (z < 1) then
z = z + LattZDimm
elseIF (z > LattZDimm) then
z = z - LattZDimm
endif

END SUBROUTINE bc


For reference, Lattice is defined like this:



INTEGER, DIMENSION(:,:,:), ALLOCATABLE :: Lattice


It is allocated depending on program input and then set equal to zero. Also for reference, I am using the GNU compiler.



I suspect that the answer to this questions may be that I cannot do better, but I wanted to see if any Fortran programmers can spot something that I am missing or may want to try.









share|improve this question










share|improve this question




share|improve this question









asked Mar 22 at 18:08









sturgman

1083




1083











  • I don´t know fortran but, what is the complexity of Lattice operation? How many points are you checking if they are different than 0?
    – juvian
    Mar 22 at 20:27










  • @juvian it is 26 of them.
    – sturgman
    Mar 22 at 21:09
















  • I don´t know fortran but, what is the complexity of Lattice operation? How many points are you checking if they are different than 0?
    – juvian
    Mar 22 at 20:27










  • @juvian it is 26 of them.
    – sturgman
    Mar 22 at 21:09















I don´t know fortran but, what is the complexity of Lattice operation? How many points are you checking if they are different than 0?
– juvian
Mar 22 at 20:27




I don´t know fortran but, what is the complexity of Lattice operation? How many points are you checking if they are different than 0?
– juvian
Mar 22 at 20:27












@juvian it is 26 of them.
– sturgman
Mar 22 at 21:09




@juvian it is 26 of them.
– sturgman
Mar 22 at 21:09










1 Answer
1






active

oldest

votes

















up vote
1
down vote



accepted










Three thoughts about this.



First is unless the lattice is small you won't actually run into a boundary condition most times. Check if a boundary condition is possible once and have two CheckForVacancy functions one with and one without boundary checks.



Second thought is remove the boundary check altogether by adding duplicate elements. In one dimension if you had the following



Index:1,2,3,4,5 
Value:3,4,3,2,1


Make the array



Index:0,1,2,3,4,5,6
Value:1,3,4,3,2,1,3


So you trade extra storage to remove the bounds checks.



Finally, you still have a lot of no_ops in the for loop. All those add zero statements do nothing. You could unroll the loop to eliminate them at the expense of more code. You removed the 0,0,0 option already I think.



Your comment indicates that the lattice gets modified during execution, that makes removing the bounds check more complex.



Depending on the size of the lattice and the number of times you call CheckForVacancy for an individual cell you could either memoize the cells adjacent to a cell or pre-compute them. This would work if you have enough memory and if you call the function with the same values multiple times.






share|improve this answer























  • Just the type of advice I was hoping for. Regarding removing the bounds checks... I will still have to check the bounds when I am modifying lattice, no? If I am at the boundary I have to modify both copies, or am I missing something? I guess that type of check might be simpler.
    – sturgman
    Mar 23 at 0:08










  • If your modifying the lattice then for an edge cell you would have to modify the data in two or more places. If that is a requirement you should put it in the question.
    – Jackson
    Mar 23 at 2:38










  • nice answer. Will add that for 2d arrays you can pre compute a summed area table and then check in O(4) the sum of values in any area of the table. Pretty sure there is something similar for 3d if you want to look for it, might not be worth with just 26 cells though
    – juvian
    Mar 23 at 2:53










  • @jackson the idea to split the work in a function for boundaries and one without them plus unrolling the loop and minimizing the number of operations there improved running time by 15%! Furthermore, the same advice applies to the other bottleneck in the program so I can expect even bigger improvement when I'm done.
    – sturgman
    Mar 25 at 23:34










  • @sturgman good to here that you've implemented these ideas and that theyvare working for you!
    – Jackson
    Mar 26 at 8:04










Your Answer




StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);








 

draft saved


draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f190224%2fimproving-performance-of-a-subroutine-that-checks-for-a-vacancy-in-a-lattice%23new-answer', 'question_page');

);

Post as a guest






























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
1
down vote



accepted










Three thoughts about this.



First is unless the lattice is small you won't actually run into a boundary condition most times. Check if a boundary condition is possible once and have two CheckForVacancy functions one with and one without boundary checks.



Second thought is remove the boundary check altogether by adding duplicate elements. In one dimension if you had the following



Index:1,2,3,4,5 
Value:3,4,3,2,1


Make the array



Index:0,1,2,3,4,5,6
Value:1,3,4,3,2,1,3


So you trade extra storage to remove the bounds checks.



Finally, you still have a lot of no_ops in the for loop. All those add zero statements do nothing. You could unroll the loop to eliminate them at the expense of more code. You removed the 0,0,0 option already I think.



Your comment indicates that the lattice gets modified during execution, that makes removing the bounds check more complex.



Depending on the size of the lattice and the number of times you call CheckForVacancy for an individual cell you could either memoize the cells adjacent to a cell or pre-compute them. This would work if you have enough memory and if you call the function with the same values multiple times.






share|improve this answer























  • Just the type of advice I was hoping for. Regarding removing the bounds checks... I will still have to check the bounds when I am modifying lattice, no? If I am at the boundary I have to modify both copies, or am I missing something? I guess that type of check might be simpler.
    – sturgman
    Mar 23 at 0:08










  • If your modifying the lattice then for an edge cell you would have to modify the data in two or more places. If that is a requirement you should put it in the question.
    – Jackson
    Mar 23 at 2:38










  • nice answer. Will add that for 2d arrays you can pre compute a summed area table and then check in O(4) the sum of values in any area of the table. Pretty sure there is something similar for 3d if you want to look for it, might not be worth with just 26 cells though
    – juvian
    Mar 23 at 2:53










  • @jackson the idea to split the work in a function for boundaries and one without them plus unrolling the loop and minimizing the number of operations there improved running time by 15%! Furthermore, the same advice applies to the other bottleneck in the program so I can expect even bigger improvement when I'm done.
    – sturgman
    Mar 25 at 23:34










  • @sturgman good to here that you've implemented these ideas and that theyvare working for you!
    – Jackson
    Mar 26 at 8:04














up vote
1
down vote



accepted










Three thoughts about this.



First is unless the lattice is small you won't actually run into a boundary condition most times. Check if a boundary condition is possible once and have two CheckForVacancy functions one with and one without boundary checks.



Second thought is remove the boundary check altogether by adding duplicate elements. In one dimension if you had the following



Index:1,2,3,4,5 
Value:3,4,3,2,1


Make the array



Index:0,1,2,3,4,5,6
Value:1,3,4,3,2,1,3


So you trade extra storage to remove the bounds checks.



Finally, you still have a lot of no_ops in the for loop. All those add zero statements do nothing. You could unroll the loop to eliminate them at the expense of more code. You removed the 0,0,0 option already I think.



Your comment indicates that the lattice gets modified during execution, that makes removing the bounds check more complex.



Depending on the size of the lattice and the number of times you call CheckForVacancy for an individual cell you could either memoize the cells adjacent to a cell or pre-compute them. This would work if you have enough memory and if you call the function with the same values multiple times.






share|improve this answer























  • Just the type of advice I was hoping for. Regarding removing the bounds checks... I will still have to check the bounds when I am modifying lattice, no? If I am at the boundary I have to modify both copies, or am I missing something? I guess that type of check might be simpler.
    – sturgman
    Mar 23 at 0:08










  • If your modifying the lattice then for an edge cell you would have to modify the data in two or more places. If that is a requirement you should put it in the question.
    – Jackson
    Mar 23 at 2:38










  • nice answer. Will add that for 2d arrays you can pre compute a summed area table and then check in O(4) the sum of values in any area of the table. Pretty sure there is something similar for 3d if you want to look for it, might not be worth with just 26 cells though
    – juvian
    Mar 23 at 2:53










  • @jackson the idea to split the work in a function for boundaries and one without them plus unrolling the loop and minimizing the number of operations there improved running time by 15%! Furthermore, the same advice applies to the other bottleneck in the program so I can expect even bigger improvement when I'm done.
    – sturgman
    Mar 25 at 23:34










  • @sturgman good to here that you've implemented these ideas and that theyvare working for you!
    – Jackson
    Mar 26 at 8:04












up vote
1
down vote



accepted







up vote
1
down vote



accepted






Three thoughts about this.



First is unless the lattice is small you won't actually run into a boundary condition most times. Check if a boundary condition is possible once and have two CheckForVacancy functions one with and one without boundary checks.



Second thought is remove the boundary check altogether by adding duplicate elements. In one dimension if you had the following



Index:1,2,3,4,5 
Value:3,4,3,2,1


Make the array



Index:0,1,2,3,4,5,6
Value:1,3,4,3,2,1,3


So you trade extra storage to remove the bounds checks.



Finally, you still have a lot of no_ops in the for loop. All those add zero statements do nothing. You could unroll the loop to eliminate them at the expense of more code. You removed the 0,0,0 option already I think.



Your comment indicates that the lattice gets modified during execution, that makes removing the bounds check more complex.



Depending on the size of the lattice and the number of times you call CheckForVacancy for an individual cell you could either memoize the cells adjacent to a cell or pre-compute them. This would work if you have enough memory and if you call the function with the same values multiple times.






share|improve this answer















Three thoughts about this.



First is unless the lattice is small you won't actually run into a boundary condition most times. Check if a boundary condition is possible once and have two CheckForVacancy functions one with and one without boundary checks.



Second thought is remove the boundary check altogether by adding duplicate elements. In one dimension if you had the following



Index:1,2,3,4,5 
Value:3,4,3,2,1


Make the array



Index:0,1,2,3,4,5,6
Value:1,3,4,3,2,1,3


So you trade extra storage to remove the bounds checks.



Finally, you still have a lot of no_ops in the for loop. All those add zero statements do nothing. You could unroll the loop to eliminate them at the expense of more code. You removed the 0,0,0 option already I think.



Your comment indicates that the lattice gets modified during execution, that makes removing the bounds check more complex.



Depending on the size of the lattice and the number of times you call CheckForVacancy for an individual cell you could either memoize the cells adjacent to a cell or pre-compute them. This would work if you have enough memory and if you call the function with the same values multiple times.







share|improve this answer















share|improve this answer



share|improve this answer








edited Mar 23 at 2:48


























answered Mar 22 at 22:25









Jackson

238210




238210











  • Just the type of advice I was hoping for. Regarding removing the bounds checks... I will still have to check the bounds when I am modifying lattice, no? If I am at the boundary I have to modify both copies, or am I missing something? I guess that type of check might be simpler.
    – sturgman
    Mar 23 at 0:08










  • If your modifying the lattice then for an edge cell you would have to modify the data in two or more places. If that is a requirement you should put it in the question.
    – Jackson
    Mar 23 at 2:38










  • nice answer. Will add that for 2d arrays you can pre compute a summed area table and then check in O(4) the sum of values in any area of the table. Pretty sure there is something similar for 3d if you want to look for it, might not be worth with just 26 cells though
    – juvian
    Mar 23 at 2:53










  • @jackson the idea to split the work in a function for boundaries and one without them plus unrolling the loop and minimizing the number of operations there improved running time by 15%! Furthermore, the same advice applies to the other bottleneck in the program so I can expect even bigger improvement when I'm done.
    – sturgman
    Mar 25 at 23:34










  • @sturgman good to here that you've implemented these ideas and that theyvare working for you!
    – Jackson
    Mar 26 at 8:04
















  • Just the type of advice I was hoping for. Regarding removing the bounds checks... I will still have to check the bounds when I am modifying lattice, no? If I am at the boundary I have to modify both copies, or am I missing something? I guess that type of check might be simpler.
    – sturgman
    Mar 23 at 0:08










  • If your modifying the lattice then for an edge cell you would have to modify the data in two or more places. If that is a requirement you should put it in the question.
    – Jackson
    Mar 23 at 2:38










  • nice answer. Will add that for 2d arrays you can pre compute a summed area table and then check in O(4) the sum of values in any area of the table. Pretty sure there is something similar for 3d if you want to look for it, might not be worth with just 26 cells though
    – juvian
    Mar 23 at 2:53










  • @jackson the idea to split the work in a function for boundaries and one without them plus unrolling the loop and minimizing the number of operations there improved running time by 15%! Furthermore, the same advice applies to the other bottleneck in the program so I can expect even bigger improvement when I'm done.
    – sturgman
    Mar 25 at 23:34










  • @sturgman good to here that you've implemented these ideas and that theyvare working for you!
    – Jackson
    Mar 26 at 8:04















Just the type of advice I was hoping for. Regarding removing the bounds checks... I will still have to check the bounds when I am modifying lattice, no? If I am at the boundary I have to modify both copies, or am I missing something? I guess that type of check might be simpler.
– sturgman
Mar 23 at 0:08




Just the type of advice I was hoping for. Regarding removing the bounds checks... I will still have to check the bounds when I am modifying lattice, no? If I am at the boundary I have to modify both copies, or am I missing something? I guess that type of check might be simpler.
– sturgman
Mar 23 at 0:08












If your modifying the lattice then for an edge cell you would have to modify the data in two or more places. If that is a requirement you should put it in the question.
– Jackson
Mar 23 at 2:38




If your modifying the lattice then for an edge cell you would have to modify the data in two or more places. If that is a requirement you should put it in the question.
– Jackson
Mar 23 at 2:38












nice answer. Will add that for 2d arrays you can pre compute a summed area table and then check in O(4) the sum of values in any area of the table. Pretty sure there is something similar for 3d if you want to look for it, might not be worth with just 26 cells though
– juvian
Mar 23 at 2:53




nice answer. Will add that for 2d arrays you can pre compute a summed area table and then check in O(4) the sum of values in any area of the table. Pretty sure there is something similar for 3d if you want to look for it, might not be worth with just 26 cells though
– juvian
Mar 23 at 2:53












@jackson the idea to split the work in a function for boundaries and one without them plus unrolling the loop and minimizing the number of operations there improved running time by 15%! Furthermore, the same advice applies to the other bottleneck in the program so I can expect even bigger improvement when I'm done.
– sturgman
Mar 25 at 23:34




@jackson the idea to split the work in a function for boundaries and one without them plus unrolling the loop and minimizing the number of operations there improved running time by 15%! Furthermore, the same advice applies to the other bottleneck in the program so I can expect even bigger improvement when I'm done.
– sturgman
Mar 25 at 23:34












@sturgman good to here that you've implemented these ideas and that theyvare working for you!
– Jackson
Mar 26 at 8:04




@sturgman good to here that you've implemented these ideas and that theyvare working for you!
– Jackson
Mar 26 at 8:04












 

draft saved


draft discarded


























 


draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f190224%2fimproving-performance-of-a-subroutine-that-checks-for-a-vacancy-in-a-lattice%23new-answer', 'question_page');

);

Post as a guest













































































Popular posts from this blog

Greedy Best First Search implementation in Rust

Function to Return a JSON Like Objects Using VBA Collections and Arrays

C++11 CLH Lock Implementation