Improving performance of a subroutine that checks for a vacancy in a lattice

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;

up vote
1
down vote

favorite

I use the following subroutine to check whether a small subsection of a 3D Int array is all equal to zero. If any value in the particular subsection is non-zero, I exit and return .false.. Within CheckForVacancy, subroutine bc gets called which takes care of out of bounds array indices.

This subroutine gets called millions of times during a particular run and it is responsible for about half the running time of the entire program. In the distant past, I tried to optimize it by changing the order in which I access the Lattice array and by using the any built-in with array slices but I never saw much improvement. Furthermore, the "array slices" solution can get complicated when dealing with the boundaries of the array. I also tried fiddling with optimization flags but I was largely experimenting blindly at that point.

The code is as follows:

LOGICAL FUNCTION CheckForVacancy (x, y, z)
 ! Checks for vacancy for a site [x,y,z]

 use atrpmodule

 IMPLICIT NONE

 INTEGER, INTENT(IN) :: x, y, z
 INTEGER :: Sx, Sy, Sz, i
 INTEGER, DIMENSION(1:26) :: SpaceX = (/1,1,1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,0,1,1,1,0,-1,-1,-1,0,0/)
 INTEGER, DIMENSION(1:26) :: SpaceY = (/0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1/)
 INTEGER, DIMENSION(1:26) :: SpaceZ = (/1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,1,1,0,1,0,-1,-1,-1,0,1,1,0/)
 !---------------------------------------------------------------------------

 checkforvacancy=.true.
 do i=1,26
 Sx = x + SpaceX(i)
 Sy = y + SpaceY(i)
 Sz = z + SpaceZ(i)
 call bc(Sx,Sy,Sz)
 if (lattice(Sx,Sy,Sz)/=0)then
 CheckForVacancy=.false.
 exit
 endif
 enddo

END FUNCTION CheckForVacancy

SUBROUTINE bc (x, y, z)
 ! Takes case of boundary conditions
 USE atrpmodule
 IMPLICIT NONE

 INTEGER :: x, y, z
 !---------------------------------------------------------------------------
 IF (x < 1) then
 x = x + LattXDimm
 elseIF(x > LattXDimm) then
 x = x - LattXDimm
 endif

 IF (y < 1) then
 y = y + LattYDimm
 elseIF (y > LattYDimm) then
 y = y - LattYDimm
 endif

 IF (z < 1) then
 z = z + LattZDimm
 elseIF (z > LattZDimm) then
 z = z - LattZDimm
 endif

END SUBROUTINE bc

For reference, Lattice is defined like this:

INTEGER, DIMENSION(:,:,:), ALLOCATABLE :: Lattice

It is allocated depending on program input and then set equal to zero. Also for reference, I am using the GNU compiler.

I suspect that the answer to this questions may be that I cannot do better, but I wanted to see if any Fortran programmers can spot something that I am missing or may want to try.

asked Mar 22 at 18:08

sturgman

1083

I donÂ´t know fortran but, what is the complexity of Lattice operation? How many points are you checking if they are different than 0?
â€“Â juvian
Mar 22 at 20:27

@juvian it is 26 of them.
â€“Â sturgman
Mar 22 at 21:09

add a commentÂ |Â

up vote
1
down vote

favorite

The code is as follows:

LOGICAL FUNCTION CheckForVacancy (x, y, z)
 ! Checks for vacancy for a site [x,y,z]

 use atrpmodule

 IMPLICIT NONE

 INTEGER, INTENT(IN) :: x, y, z
 INTEGER :: Sx, Sy, Sz, i
 INTEGER, DIMENSION(1:26) :: SpaceX = (/1,1,1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,0,1,1,1,0,-1,-1,-1,0,0/)
 INTEGER, DIMENSION(1:26) :: SpaceY = (/0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1/)
 INTEGER, DIMENSION(1:26) :: SpaceZ = (/1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,1,1,0,1,0,-1,-1,-1,0,1,1,0/)
 !---------------------------------------------------------------------------

 checkforvacancy=.true.
 do i=1,26
 Sx = x + SpaceX(i)
 Sy = y + SpaceY(i)
 Sz = z + SpaceZ(i)
 call bc(Sx,Sy,Sz)
 if (lattice(Sx,Sy,Sz)/=0)then
 CheckForVacancy=.false.
 exit
 endif
 enddo

END FUNCTION CheckForVacancy

SUBROUTINE bc (x, y, z)
 ! Takes case of boundary conditions
 USE atrpmodule
 IMPLICIT NONE

 INTEGER :: x, y, z
 !---------------------------------------------------------------------------
 IF (x < 1) then
 x = x + LattXDimm
 elseIF(x > LattXDimm) then
 x = x - LattXDimm
 endif

 IF (y < 1) then
 y = y + LattYDimm
 elseIF (y > LattYDimm) then
 y = y - LattYDimm
 endif

 IF (z < 1) then
 z = z + LattZDimm
 elseIF (z > LattZDimm) then
 z = z - LattZDimm
 endif

END SUBROUTINE bc

For reference, Lattice is defined like this:

INTEGER, DIMENSION(:,:,:), ALLOCATABLE :: Lattice

It is allocated depending on program input and then set equal to zero. Also for reference, I am using the GNU compiler.

I suspect that the answer to this questions may be that I cannot do better, but I wanted to see if any Fortran programmers can spot something that I am missing or may want to try.

asked Mar 22 at 18:08

sturgman

1083

I donÂ´t know fortran but, what is the complexity of Lattice operation? How many points are you checking if they are different than 0?
â€“Â juvian
Mar 22 at 20:27

@juvian it is 26 of them.
â€“Â sturgman
Mar 22 at 21:09

add a commentÂ |Â

up vote
1
down vote

favorite

The code is as follows:

LOGICAL FUNCTION CheckForVacancy (x, y, z)
 ! Checks for vacancy for a site [x,y,z]

 use atrpmodule

 IMPLICIT NONE

 INTEGER, INTENT(IN) :: x, y, z
 INTEGER :: Sx, Sy, Sz, i
 INTEGER, DIMENSION(1:26) :: SpaceX = (/1,1,1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,0,1,1,1,0,-1,-1,-1,0,0/)
 INTEGER, DIMENSION(1:26) :: SpaceY = (/0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1/)
 INTEGER, DIMENSION(1:26) :: SpaceZ = (/1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,1,1,0,1,0,-1,-1,-1,0,1,1,0/)
 !---------------------------------------------------------------------------

 checkforvacancy=.true.
 do i=1,26
 Sx = x + SpaceX(i)
 Sy = y + SpaceY(i)
 Sz = z + SpaceZ(i)
 call bc(Sx,Sy,Sz)
 if (lattice(Sx,Sy,Sz)/=0)then
 CheckForVacancy=.false.
 exit
 endif
 enddo

END FUNCTION CheckForVacancy

SUBROUTINE bc (x, y, z)
 ! Takes case of boundary conditions
 USE atrpmodule
 IMPLICIT NONE

 INTEGER :: x, y, z
 !---------------------------------------------------------------------------
 IF (x < 1) then
 x = x + LattXDimm
 elseIF(x > LattXDimm) then
 x = x - LattXDimm
 endif

 IF (y < 1) then
 y = y + LattYDimm
 elseIF (y > LattYDimm) then
 y = y - LattYDimm
 endif

 IF (z < 1) then
 z = z + LattZDimm
 elseIF (z > LattZDimm) then
 z = z - LattZDimm
 endif

END SUBROUTINE bc

For reference, Lattice is defined like this:

INTEGER, DIMENSION(:,:,:), ALLOCATABLE :: Lattice

It is allocated depending on program input and then set equal to zero. Also for reference, I am using the GNU compiler.

I suspect that the answer to this questions may be that I cannot do better, but I wanted to see if any Fortran programmers can spot something that I am missing or may want to try.

asked Mar 22 at 18:08

sturgman

1083

The code is as follows:

LOGICAL FUNCTION CheckForVacancy (x, y, z)
 ! Checks for vacancy for a site [x,y,z]

 use atrpmodule

 IMPLICIT NONE

 INTEGER, INTENT(IN) :: x, y, z
 INTEGER :: Sx, Sy, Sz, i
 INTEGER, DIMENSION(1:26) :: SpaceX = (/1,1,1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,0,1,1,1,0,-1,-1,-1,0,0/)
 INTEGER, DIMENSION(1:26) :: SpaceY = (/0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1/)
 INTEGER, DIMENSION(1:26) :: SpaceZ = (/1,0,-1,-1,-1,0,1,1,1,0,-1,-1,-1,0,1,1,0,1,0,-1,-1,-1,0,1,1,0/)
 !---------------------------------------------------------------------------

 checkforvacancy=.true.
 do i=1,26
 Sx = x + SpaceX(i)
 Sy = y + SpaceY(i)
 Sz = z + SpaceZ(i)
 call bc(Sx,Sy,Sz)
 if (lattice(Sx,Sy,Sz)/=0)then
 CheckForVacancy=.false.
 exit
 endif
 enddo

END FUNCTION CheckForVacancy

SUBROUTINE bc (x, y, z)
 ! Takes case of boundary conditions
 USE atrpmodule
 IMPLICIT NONE

 INTEGER :: x, y, z
 !---------------------------------------------------------------------------
 IF (x < 1) then
 x = x + LattXDimm
 elseIF(x > LattXDimm) then
 x = x - LattXDimm
 endif

 IF (y < 1) then
 y = y + LattYDimm
 elseIF (y > LattYDimm) then
 y = y - LattYDimm
 endif

 IF (z < 1) then
 z = z + LattZDimm
 elseIF (z > LattZDimm) then
 z = z - LattZDimm
 endif

END SUBROUTINE bc

For reference, Lattice is defined like this:

INTEGER, DIMENSION(:,:,:), ALLOCATABLE :: Lattice

It is allocated depending on program input and then set equal to zero. Also for reference, I am using the GNU compiler.

I suspect that the answer to this questions may be that I cannot do better, but I wanted to see if any Fortran programmers can spot something that I am missing or may want to try.

asked Mar 22 at 18:08

sturgman

1083

asked Mar 22 at 18:08

sturgman

1083

asked Mar 22 at 18:08

sturgman

1083

asked Mar 22 at 18:08

sturgman

1083

I donÂ´t know fortran but, what is the complexity of Lattice operation? How many points are you checking if they are different than 0?
â€“Â juvian
Mar 22 at 20:27

@juvian it is 26 of them.
â€“Â sturgman
Mar 22 at 21:09

add a commentÂ |Â

I donÂ´t know fortran but, what is the complexity of Lattice operation? How many points are you checking if they are different than 0?
â€“Â juvian
Mar 22 at 20:27

@juvian it is 26 of them.
â€“Â sturgman
Mar 22 at 21:09

I donÂ´t know fortran but, what is the complexity of Lattice operation? How many points are you checking if they are different than 0?
â€“Â juvian
Mar 22 at 20:27

@juvian it is 26 of them.
â€“Â sturgman
Mar 22 at 21:09

add a commentÂ |Â

1 Answer
1

active

oldest

votes

up vote
1
down vote

accepted

Three thoughts about this.

First is unless the lattice is small you won't actually run into a boundary condition most times. Check if a boundary condition is possible once and have two CheckForVacancy functions one with and one without boundary checks.

Second thought is remove the boundary check altogether by adding duplicate elements. In one dimension if you had the following

Index:1,2,3,4,5 
Value:3,4,3,2,1

Make the array

Index:0,1,2,3,4,5,6
Value:1,3,4,3,2,1,3

So you trade extra storage to remove the bounds checks.

Finally, you still have a lot of no_ops in the for loop. All those add zero statements do nothing. You could unroll the loop to eliminate them at the expense of more code. You removed the 0,0,0 option already I think.

Your comment indicates that the lattice gets modified during execution, that makes removing the bounds check more complex.

Depending on the size of the lattice and the number of times you call CheckForVacancy for an individual cell you could either memoize the cells adjacent to a cell or pre-compute them. This would work if you have enough memory and if you call the function with the same values multiple times.

edited Mar 23 at 2:48

answered Mar 22 at 22:25

Jackson

238210

Just the type of advice I was hoping for. Regarding removing the bounds checks... I will still have to check the bounds when I am modifying lattice, no? If I am at the boundary I have to modify both copies, or am I missing something? I guess that type of check might be simpler.
â€“Â sturgman
Mar 23 at 0:08

If your modifying the lattice then for an edge cell you would have to modify the data in two or more places. If that is a requirement you should put it in the question.
â€“Â Jackson
Mar 23 at 2:38

nice answer. Will add that for 2d arrays you can pre compute a summed area table and then check in O(4) the sum of values in any area of the table. Pretty sure there is something similar for 3d if you want to look for it, might not be worth with just 26 cells though
â€“Â juvian
Mar 23 at 2:53

@jackson the idea to split the work in a function for boundaries and one without them plus unrolling the loop and minimizing the number of operations there improved running time by 15%! Furthermore, the same advice applies to the other bottleneck in the program so I can expect even bigger improvement when I'm done.
â€“Â sturgman
Mar 25 at 23:34

@sturgman good to here that you've implemented these ideas and that theyvare working for you!
â€“Â Jackson
Mar 26 at 8:04

add a commentÂ |Â

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f190224%2fimproving-performance-of-a-subroutine-that-checks-for-a-vacancy-in-a-lattice%23new-answer', 'question_page');

);

Post as a guest

Name

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
1
down vote

accepted

Three thoughts about this.

Second thought is remove the boundary check altogether by adding duplicate elements. In one dimension if you had the following

Index:1,2,3,4,5 
Value:3,4,3,2,1

Make the array

Index:0,1,2,3,4,5,6
Value:1,3,4,3,2,1,3

So you trade extra storage to remove the bounds checks.

Your comment indicates that the lattice gets modified during execution, that makes removing the bounds check more complex.

edited Mar 23 at 2:48

answered Mar 22 at 22:25

Jackson

238210

Just the type of advice I was hoping for. Regarding removing the bounds checks... I will still have to check the bounds when I am modifying lattice, no? If I am at the boundary I have to modify both copies, or am I missing something? I guess that type of check might be simpler.
â€“Â sturgman
Mar 23 at 0:08

If your modifying the lattice then for an edge cell you would have to modify the data in two or more places. If that is a requirement you should put it in the question.
â€“Â Jackson
Mar 23 at 2:38

nice answer. Will add that for 2d arrays you can pre compute a summed area table and then check in O(4) the sum of values in any area of the table. Pretty sure there is something similar for 3d if you want to look for it, might not be worth with just 26 cells though
â€“Â juvian
Mar 23 at 2:53

@jackson the idea to split the work in a function for boundaries and one without them plus unrolling the loop and minimizing the number of operations there improved running time by 15%! Furthermore, the same advice applies to the other bottleneck in the program so I can expect even bigger improvement when I'm done.
â€“Â sturgman
Mar 25 at 23:34

@sturgman good to here that you've implemented these ideas and that theyvare working for you!
â€“Â Jackson
Mar 26 at 8:04

add a commentÂ |Â

up vote
1
down vote

accepted

Three thoughts about this.

Second thought is remove the boundary check altogether by adding duplicate elements. In one dimension if you had the following

Index:1,2,3,4,5 
Value:3,4,3,2,1

Make the array

Index:0,1,2,3,4,5,6
Value:1,3,4,3,2,1,3

So you trade extra storage to remove the bounds checks.

Your comment indicates that the lattice gets modified during execution, that makes removing the bounds check more complex.

edited Mar 23 at 2:48

answered Mar 22 at 22:25

Jackson

238210

Just the type of advice I was hoping for. Regarding removing the bounds checks... I will still have to check the bounds when I am modifying lattice, no? If I am at the boundary I have to modify both copies, or am I missing something? I guess that type of check might be simpler.
â€“Â sturgman
Mar 23 at 0:08

If your modifying the lattice then for an edge cell you would have to modify the data in two or more places. If that is a requirement you should put it in the question.
â€“Â Jackson
Mar 23 at 2:38

nice answer. Will add that for 2d arrays you can pre compute a summed area table and then check in O(4) the sum of values in any area of the table. Pretty sure there is something similar for 3d if you want to look for it, might not be worth with just 26 cells though
â€“Â juvian
Mar 23 at 2:53

@jackson the idea to split the work in a function for boundaries and one without them plus unrolling the loop and minimizing the number of operations there improved running time by 15%! Furthermore, the same advice applies to the other bottleneck in the program so I can expect even bigger improvement when I'm done.
â€“Â sturgman
Mar 25 at 23:34

@sturgman good to here that you've implemented these ideas and that theyvare working for you!
â€“Â Jackson
Mar 26 at 8:04

add a commentÂ |Â

up vote
1
down vote

accepted

Three thoughts about this.

Second thought is remove the boundary check altogether by adding duplicate elements. In one dimension if you had the following

Index:1,2,3,4,5 
Value:3,4,3,2,1

Make the array

Index:0,1,2,3,4,5,6
Value:1,3,4,3,2,1,3

So you trade extra storage to remove the bounds checks.

Your comment indicates that the lattice gets modified during execution, that makes removing the bounds check more complex.

edited Mar 23 at 2:48

answered Mar 22 at 22:25

Jackson

238210

Three thoughts about this.

Second thought is remove the boundary check altogether by adding duplicate elements. In one dimension if you had the following

Index:1,2,3,4,5 
Value:3,4,3,2,1

Make the array

Index:0,1,2,3,4,5,6
Value:1,3,4,3,2,1,3

So you trade extra storage to remove the bounds checks.

Your comment indicates that the lattice gets modified during execution, that makes removing the bounds check more complex.

edited Mar 23 at 2:48

answered Mar 22 at 22:25

Jackson

238210

edited Mar 23 at 2:48

answered Mar 22 at 22:25

Jackson

238210

answered Mar 22 at 22:25

Jackson

238210

answered Mar 22 at 22:25

Jackson

238210

Just the type of advice I was hoping for. Regarding removing the bounds checks... I will still have to check the bounds when I am modifying lattice, no? If I am at the boundary I have to modify both copies, or am I missing something? I guess that type of check might be simpler.
â€“Â sturgman
Mar 23 at 0:08

If your modifying the lattice then for an edge cell you would have to modify the data in two or more places. If that is a requirement you should put it in the question.
â€“Â Jackson
Mar 23 at 2:38

nice answer. Will add that for 2d arrays you can pre compute a summed area table and then check in O(4) the sum of values in any area of the table. Pretty sure there is something similar for 3d if you want to look for it, might not be worth with just 26 cells though
â€“Â juvian
Mar 23 at 2:53

@jackson the idea to split the work in a function for boundaries and one without them plus unrolling the loop and minimizing the number of operations there improved running time by 15%! Furthermore, the same advice applies to the other bottleneck in the program so I can expect even bigger improvement when I'm done.
â€“Â sturgman
Mar 25 at 23:34

@sturgman good to here that you've implemented these ideas and that theyvare working for you!
â€“Â Jackson
Mar 26 at 8:04

add a commentÂ |Â

Just the type of advice I was hoping for. Regarding removing the bounds checks... I will still have to check the bounds when I am modifying lattice, no? If I am at the boundary I have to modify both copies, or am I missing something? I guess that type of check might be simpler.
â€“Â sturgman
Mar 23 at 0:08

If your modifying the lattice then for an edge cell you would have to modify the data in two or more places. If that is a requirement you should put it in the question.
â€“Â Jackson
Mar 23 at 2:38

nice answer. Will add that for 2d arrays you can pre compute a summed area table and then check in O(4) the sum of values in any area of the table. Pretty sure there is something similar for 3d if you want to look for it, might not be worth with just 26 cells though
â€“Â juvian
Mar 23 at 2:53

@jackson the idea to split the work in a function for boundaries and one without them plus unrolling the loop and minimizing the number of operations there improved running time by 15%! Furthermore, the same advice applies to the other bottleneck in the program so I can expect even bigger improvement when I'm done.
â€“Â sturgman
Mar 25 at 23:34

@sturgman good to here that you've implemented these ideas and that theyvare working for you!
â€“Â Jackson
Mar 26 at 8:04

Just the type of advice I was hoping for. Regarding removing the bounds checks... I will still have to check the bounds when I am modifying lattice, no? If I am at the boundary I have to modify both copies, or am I missing something? I guess that type of check might be simpler.
â€“Â sturgman
Mar 23 at 0:08

If your modifying the lattice then for an edge cell you would have to modify the data in two or more places. If that is a requirement you should put it in the question.
â€“Â Jackson
Mar 23 at 2:38

nice answer. Will add that for 2d arrays you can pre compute a summed area table and then check in O(4) the sum of values in any area of the table. Pretty sure there is something similar for 3d if you want to look for it, might not be worth with just 26 cells though
â€“Â juvian
Mar 23 at 2:53

@jackson the idea to split the work in a function for boundaries and one without them plus unrolling the loop and minimizing the number of operations there improved running time by 15%! Furthermore, the same advice applies to the other bottleneck in the program so I can expect even bigger improvement when I'm done.
â€“Â sturgman
Mar 25 at 23:34

@sturgman good to here that you've implemented these ideas and that theyvare working for you!
â€“Â Jackson
Mar 26 at 8:04

add a commentÂ |Â

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

搜尋此網誌

trjhtr