Calculate electric field of a charged annulus
This script calculates the shape of the electric field on a 2D grid perpendicular to a uniformly charged annulus, using SciPy's dblquad (see the tutorial and documentation).
I've chosen a coarse grid and specified a very loose relative error epsrel here so that this example runs in about 10 seconds, but I'll be using it on a much finer grid and with a much tighter error tolerance. I'd like to know whether there are ways this can be significantly sped up without loss of accuracy, which should be in the 1E-7 neighborhood.
I'll be avoiding pathological points manually, as I've done here by staying off the z=0 plane. I've left out constants and units to reduce clutter.
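For reference, this is the integral the code below evaluates for the x component (transcribed directly from the integrand as coded; the y and z components are analogous):

$$E_x(x_0, y_0, z_0) = \int_0^{2\pi}\!\!\int_{r_\mathrm{min}}^{r_\mathrm{max}} \frac{x_0 - r\cos\theta}{\left[(x_0 - r\cos\theta)^2 + (y_0 - r\sin\theta)^2 + z_0^2\right]^{3/2}} \, dr \, d\theta$$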
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import dblquad

def Excalc(r, th):
    x, y, z = r*np.cos(th), r*np.sin(th), 0.0
    return (x0-x) * ((x0-x)**2 + (y0-y)**2 + (z0-z)**2)**-1.5

def Eycalc(r, th):
    x, y, z = r*np.cos(th), r*np.sin(th), 0.0
    return (y0-y) * ((x0-x)**2 + (y0-y)**2 + (z0-z)**2)**-1.5

def Ezcalc(r, th):
    x, y, z = r*np.cos(th), r*np.sin(th), 0.0
    return (z0-z) * ((x0-x)**2 + (y0-y)**2 + (z0-z)**2)**-1.5

twopi = 2.*np.pi

# annulus of uniform, unit charge density
rmin, rmax = 0.8, 1.2
thmin, thmax = 0, twopi

x = np.arange(-2, 2.1, 0.1)
z = np.arange(-1.55, 1.6, 0.1)  # avoid z=0 plane for integration.
X, Z = np.meshgrid(x, z)
s = X.shape
eps = 1E-4
y0 = 0

fields, errors = [], []
for x0, z0 in zip(X.flatten(), Z.flatten()):
    # use of lambda in lieu of function calls https://stackoverflow.com/a/49441680/3904031
    Ex, Exerr = dblquad(Excalc, thmin, thmax, lambda t: rmin, lambda t: rmax, epsrel=eps)
    Ey, Eyerr = dblquad(Eycalc, thmin, thmax, lambda t: rmin, lambda t: rmax, epsrel=eps)
    Ez, Ezerr = dblquad(Ezcalc, thmin, thmax, lambda t: rmin, lambda t: rmax, epsrel=eps)
    fields.append([Ex, Ey, Ez])
    errors.append([Exerr, Eyerr, Ezerr])

Evec = np.stack([np.array(thing).reshape(s) for thing in zip(*fields)])
Emag = np.sqrt((Evec**2).sum(axis=0))
Evecerr = np.stack([np.array(thing).reshape(s) for thing in zip(*errors)])
Emagerr = np.sqrt((Evecerr**2).sum(axis=0))
if True:
    names = 'Ex', 'Ey', 'Ez', 'Emag'
    plt.figure()
    for i, (E, name) in enumerate(zip([e for e in Evec] + [Emag], names)):
        smalls = np.abs(E) < 1E-08
        E[smalls] = 0.  # fudge to keep Ey from showing 1E-14 fields
        plt.subplot(2, 2, i+1)
        plt.imshow(E, origin='lower')
        plt.colorbar()
        plt.title(name, fontsize=16)
    plt.show()
if True:
    names = 'Ex rel error', 'Ey rel error', 'Ez rel error', 'Emag rel error'
    plt.figure()
    # pair each error map with its corresponding field
    for i, (E, err, name) in enumerate(zip(list(Evec) + [Emag],
                                           list(Evecerr) + [Emagerr], names)):
        plt.subplot(2, 2, i+1)
        plt.imshow(err/np.abs(E), origin='lower')
        plt.colorbar()
        plt.title(name, fontsize=16)
    plt.show()
Tags: python, python-2.7, scipy
asked Mar 23 at 3:45 by uhoh; edited Mar 23 at 8:45 by Mathias Ettinger
2 Answers

Accepted answer (score 5) by Maarten Fabré, answered Mar 23 at 15:38
I think the main time-consuming factor here is the dblquad calls themselves; that said, there are some smaller improvements you can make.

split into functions

Lookup of local variables in a function is faster than a global lookup, so putting each part in a function can already speed this up. If you do this, beware that your E_calc functions use global state (x0, y0 and z0), so they should be defined in the function scope, where they can pick up the local variables.
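As a quick illustration of the local-versus-global difference (a hypothetical micro-benchmark, not from the original answer):

import numpy as np
from timeit import timeit

def global_lookup():
    total = 0.0
    for _ in range(100000):
        total += np.cos(1.0)   # 'np' is resolved as a global on every iteration
    return total

def local_lookup():
    cos = np.cos               # bind the function to a local name once
    total = 0.0
    for _ in range(100000):
        total += cos(1.0)
    return total

print(timeit(global_lookup, number=10))  # typically a bit slower
print(timeit(local_lookup, number=10))   # typically a bit faster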
numpy

You can use numpy a bit more to assemble the final answers. Instead of keeping two arrays, fields and errors, you can keep everything in one array and index out the values and the errors.
def calc2():
    twopi = 2.*np.pi

    # annulus of uniform, unit charge density
    rmin, rmax = 0.8, 1.2
    thmin, thmax = 0, twopi

    x = np.arange(-2, 2.1, 0.1)
    z = np.arange(-1.55, 1.6, 0.1)  # avoid z=0 plane for integration.
    X, Z = np.meshgrid(x, z)
    s = X.shape
    eps = 1E-4
    y0 = 0

    def Excalc(r, th):
        x, y, z = r*np.cos(th), r*np.sin(th), 0.0
        return (x0-x) * ((x0-x)**2 + (y0-y)**2 + (z0-z)**2)**-1.5

    def Eycalc(r, th):
        x, y, z = r*np.cos(th), r*np.sin(th), 0.0
        return (y0-y) * ((x0-x)**2 + (y0-y)**2 + (z0-z)**2)**-1.5

    def Ezcalc(r, th):
        x, y, z = r*np.cos(th), r*np.sin(th), 0.0
        return (z0-z) * ((x0-x)**2 + (y0-y)**2 + (z0-z)**2)**-1.5

    values = []
    for x0, z0 in zip(X.flatten(), Z.flatten()):
        # use of lambda in lieu of function calls https://stackoverflow.com/a/49441680/3904031
        Ex, Exerr = dblquad(Excalc, thmin, thmax, lambda t: rmin, lambda t: rmax, epsrel=eps)
        Ey, Eyerr = dblquad(Eycalc, thmin, thmax, lambda t: rmin, lambda t: rmax, epsrel=eps)
        Ez, Ezerr = dblquad(Ezcalc, thmin, thmax, lambda t: rmin, lambda t: rmax, epsrel=eps)
        values.append([Ex, Ey, Ez, Exerr, Eyerr, Ezerr])

    values = np.array(values).T.reshape(6, *X.shape)
    e_vec = values[:3]
    e_mag = np.linalg.norm(e_vec, axis=0)
    e_vec_err = values[3:]
    e_mag_err = np.linalg.norm(e_vec_err, axis=0)
    return e_vec, e_mag, e_vec_err, e_mag_err
This resulted in only a very slight reduction in time, so it will not save you heaps.

lru_cache

Refactoring the calc methods to reuse most of the result actually slowed things down; this is how I tackled that:
from functools import partial, lru_cache

@lru_cache(None)
def calc(r, th, coords_0):
    coords_0 = np.array(coords_0)
    diff = coords_0 - np.array((r*np.cos(th), r*np.sin(th), 0.0))
    return diff * ((diff)**2).sum()**-1.5

def calc3():
    rmin, rmax = 0.8, 1.2
    thmin, thmax = 0, 2.*np.pi
    eps = 1E-4
    y0 = 0

    x = np.arange(-2, 2.1, 0.2)
    z = np.arange(-1.55, 1.6, 0.2)  # avoid z=0 plane for integration.
    X, Z = np.meshgrid(x, z)

    calc_axis = lambda r, th, coords_0, axis: calc(r, th, coords_0)[axis]

    values = []
    for x0, z0 in zip(X.flatten(), Z.flatten()):
        coords_0 = (x0, y0, z0)
        Excalc, Eycalc, Ezcalc = (partial(calc_axis, coords_0=coords_0, axis=i) for i in range(3))
        # use of lambda in lieu of function calls https://stackoverflow.com/a/49441680/3904031
        Ex, Exerr = dblquad(Excalc, thmin, thmax, lambda t: rmin, lambda t: rmax, epsrel=eps)
        Ey, Eyerr = dblquad(Eycalc, thmin, thmax, lambda t: rmin, lambda t: rmax, epsrel=eps)
        Ez, Ezerr = dblquad(Ezcalc, thmin, thmax, lambda t: rmin, lambda t: rmax, epsrel=eps)
        values.append((Ex, Ey, Ez, Exerr, Eyerr, Ezerr))

    print(calc.cache_info())
    values = np.array(values).T.reshape(6, *X.shape)
    e_vec = values[:3]
    e_mag = np.linalg.norm(e_vec, axis=0)
    e_vec_err = values[3:]
    e_mag_err = np.linalg.norm(e_vec_err, axis=0)
    return e_vec, e_mag, e_vec_err, e_mag_err, values
Apparently, about half of the calls to calc were cached, but moving the calculation into this method resulted in a net slowdown.

numba

What did result in a speed increase was numba: just adding @jit before def calc(r, th, coords_0): made the calculation three times as fast.
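For illustration, a minimal sketch of what that decoration could look like (my sketch, not code from the answer; numba is happiest with plain scalar math, so I assume a scalar variant of calc rather than the lru_cache/array version above):

import numpy as np
from numba import jit

@jit(nopython=True)  # hypothetical scalar variant of calc(); assumes numba is installed
def calc_jit(r, th, x0, y0, z0):
    # field kernel (r0 - r') / |r0 - r'|**3 for a source point on the annulus
    dx = x0 - r*np.cos(th)
    dy = y0 - r*np.sin(th)
    dz = z0  # source points lie in the z=0 plane
    inv = (dx*dx + dy*dy + dz*dz)**-1.5
    return dx*inv, dy*inv, dz*inv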
Comments:

uhoh (Mar 23 at 16:07): Okay, this is great! I will certainly give this a try! It's late here so will test tomorrow. Does "3 times as fast" apply to the whole thing? I've assumed that dblquad is already compiled, so @jit would not be affecting the actual numerical integration time, would it?

Maarten Fabré (Mar 24 at 8:40): I haven't tried @jit in front of the main routine, just before the smaller calc routine. But since that one gets executed millions of times while calculating the dblquad, that speeds up the program dramatically.

uhoh (Mar 24 at 10:58): Oh, I understand what you mean now, and of course it makes sense! Okay, I'll give it a try soon, thanks!

uhoh (Mar 24 at 11:41): Great! Original: 42 sec; lookup & better numpification: 32 sec; adding @jit: 8 sec. Okay, there is quite a lot I can learn here in just this one answer. Thank you very much!
Answer (score 1) by match, answered Mar 23 at 9:20, edited Mar 23 at 9:37
I'm not sure if it's the case, but if any of the 'calc' functions are ever called more than once with the same r and th values, then functools.lru_cache could help speed things up here.

This decorates a function and stores a cache of the results for each set of parameters, returning the cached result immediately rather than re-running the calculation. It does require more memory to store the results, however, so you have to consider whether that trade-off is acceptable.

You might be able to gain some speed by merging the main part of these methods and returning x, y, z and the 'calculation' result. This is more likely to hit the cache, since you now have one cache instead of three. A simplified example:
import functools

@functools.lru_cache()
def Ecalc(r, th):
    x, y, z = r*np.cos(th), r*np.sin(th), 0.0
    return x, y, z, ((x0-x)**2 + (y0-y)**2 + (z0-z)**2)**-1.5

# Maybe also an lru_cache decorator here?
def new_method(axis, r, th):
    x, y, z, result = Ecalc(r, th)
    axes = {'x': x, 'y': y, 'z': z}
    origins = {'x': x0, 'y': y0, 'z': z0}
    return (origins[axis] - axes[axis]) * result

new_method('x', r, th)
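For reference, the hit/miss bookkeeping mentioned in the comments below can be read straight off the wrapped function (a usage sketch; the numbers shown are illustrative):

print(Ecalc.cache_info())   # e.g. CacheInfo(hits=0, misses=12345, maxsize=128, currsize=12345)
Ecalc.cache_clear()         # reset the counters and the cache between experiments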
Comments:

uhoh (Mar 23 at 10:41): I see. I think it would be quite infrequent for the dblquad to call the exact same floats very often, considering how it works, but I will give this a try because it looks like something very useful and good to know about. If I don't see a large change in performance, is there some simple way to add a counter to see how often the lru_cache is actually used?

Gareth Rees (Mar 23 at 10:43): @uhoh: The documentation says: "the wrapped function is instrumented with a cache_info() function that returns a named tuple showing hits, misses, maxsize and currsize". So have a look at Ecalc.cache_info(). (But I doubt a cache will be useful.)

Maarten Fabré (Mar 23 at 12:48): Beware that x0 and z0 change every iteration.