Aggregate conditional logic test results iterated over changing row selections

Clash Royale CLAN TAG#URR8PPP
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;
up vote
1
down vote
favorite
This for-loop is the biggest bottleneck in my project, and I'm curious whether there is an opportunity to speed it up by improving the grammar or by using a different strategy altogether, perhaps. I'm a noob to Python and migrating this from Excel.
Here's some sample data and the scheme. j's range is actually 25k and i's about 1k, so it takes quite a while.
Functionally, it selects rows in df1 where a string in any of the 'typeX' columns matches a particular element in df2_columns, then tests the sl_pt element on each of those rows. It creates a 'dummy' array s to reduce the 3-part 'or' test down to a T/F, and performs some math on the count of T's as well as the count of F's. It tests with every sl_pt element and then goes to the next element in df2_columns, rinse, repeat.
import pandas as pd
import numpy as np
import functools as ft
df1 = pd.DataFrame('type1' : ['A1_v3', 'A2_v2', 'A1_v3', 'B1_v2'],
'type2' : ['A1_vr1', 'A2_vr3', 'A1_vr3', 'B1_vr2'],
'type3' : ['A1_vp4', 'A2_vp2', 'A1_vp3', 'B1_vp4'],
'mae1' : [14, 16, 4, 63],
'mae2' : [17, 29, 38, 63],
'mae3' : [27, 29, 77, 63],
'mfe1' : [48, 31, 59, 0],
'mfe2' : [79, 37, 59, 14],
'mfe3' : [79, 108, 71, 36])
df2 = pd.DataFrame('sl_pt' : [[22, 62], [22, 63], [22, 64], [22, 66], [22, 68], [22, 70]])
df2_columns = ['A1_v3', 'A2_vr3', 'B1_vp4']
for i in range(3):
tag = ft.reduce(np.logical_or, [df1['type'.format(k)].str.startswith(df2_columns[i]) for k in range(1,4)])
selection = df1.loc[tag]
for j in range(6):
sl = df2['sl_pt'][j][0]
pt = df2['sl_pt'][j][1]
s = ft.reduce(np.logical_or, [(selection['mae'.format(m)] >= 0) & (selection['mae'.format(m)] <= sl) & (selection['mfe'.format(m)] >= pt) for m in range(1,4)])
df2.loc[j, df2_columns[i]] = sum(s) * pt * 10 + (len(s) - sum(s)) * sl * -10
python performance beginner numpy pandas
add a comment |Â
up vote
1
down vote
favorite
This for-loop is the biggest bottleneck in my project, and I'm curious whether there is an opportunity to speed it up by improving the grammar or by using a different strategy altogether, perhaps. I'm a noob to Python and migrating this from Excel.
Here's some sample data and the scheme. j's range is actually 25k and i's about 1k, so it takes quite a while.
Functionally, it selects rows in df1 where a string in any of the 'typeX' columns matches a particular element in df2_columns, then tests the sl_pt element on each of those rows. It creates a 'dummy' array s to reduce the 3-part 'or' test down to a T/F, and performs some math on the count of T's as well as the count of F's. It tests with every sl_pt element and then goes to the next element in df2_columns, rinse, repeat.
import pandas as pd
import numpy as np
import functools as ft
df1 = pd.DataFrame('type1' : ['A1_v3', 'A2_v2', 'A1_v3', 'B1_v2'],
'type2' : ['A1_vr1', 'A2_vr3', 'A1_vr3', 'B1_vr2'],
'type3' : ['A1_vp4', 'A2_vp2', 'A1_vp3', 'B1_vp4'],
'mae1' : [14, 16, 4, 63],
'mae2' : [17, 29, 38, 63],
'mae3' : [27, 29, 77, 63],
'mfe1' : [48, 31, 59, 0],
'mfe2' : [79, 37, 59, 14],
'mfe3' : [79, 108, 71, 36])
df2 = pd.DataFrame('sl_pt' : [[22, 62], [22, 63], [22, 64], [22, 66], [22, 68], [22, 70]])
df2_columns = ['A1_v3', 'A2_vr3', 'B1_vp4']
for i in range(3):
tag = ft.reduce(np.logical_or, [df1['type'.format(k)].str.startswith(df2_columns[i]) for k in range(1,4)])
selection = df1.loc[tag]
for j in range(6):
sl = df2['sl_pt'][j][0]
pt = df2['sl_pt'][j][1]
s = ft.reduce(np.logical_or, [(selection['mae'.format(m)] >= 0) & (selection['mae'.format(m)] <= sl) & (selection['mfe'.format(m)] >= pt) for m in range(1,4)])
df2.loc[j, df2_columns[i]] = sum(s) * pt * 10 + (len(s) - sum(s)) * sl * -10
python performance beginner numpy pandas
add a comment |Â
up vote
1
down vote
favorite
up vote
1
down vote
favorite
This for-loop is the biggest bottleneck in my project, and I'm curious whether there is an opportunity to speed it up by improving the grammar or by using a different strategy altogether, perhaps. I'm a noob to Python and migrating this from Excel.
Here's some sample data and the scheme. j's range is actually 25k and i's about 1k, so it takes quite a while.
Functionally, it selects rows in df1 where a string in any of the 'typeX' columns matches a particular element in df2_columns, then tests the sl_pt element on each of those rows. It creates a 'dummy' array s to reduce the 3-part 'or' test down to a T/F, and performs some math on the count of T's as well as the count of F's. It tests with every sl_pt element and then goes to the next element in df2_columns, rinse, repeat.
import pandas as pd
import numpy as np
import functools as ft
df1 = pd.DataFrame('type1' : ['A1_v3', 'A2_v2', 'A1_v3', 'B1_v2'],
'type2' : ['A1_vr1', 'A2_vr3', 'A1_vr3', 'B1_vr2'],
'type3' : ['A1_vp4', 'A2_vp2', 'A1_vp3', 'B1_vp4'],
'mae1' : [14, 16, 4, 63],
'mae2' : [17, 29, 38, 63],
'mae3' : [27, 29, 77, 63],
'mfe1' : [48, 31, 59, 0],
'mfe2' : [79, 37, 59, 14],
'mfe3' : [79, 108, 71, 36])
df2 = pd.DataFrame('sl_pt' : [[22, 62], [22, 63], [22, 64], [22, 66], [22, 68], [22, 70]])
df2_columns = ['A1_v3', 'A2_vr3', 'B1_vp4']
for i in range(3):
tag = ft.reduce(np.logical_or, [df1['type'.format(k)].str.startswith(df2_columns[i]) for k in range(1,4)])
selection = df1.loc[tag]
for j in range(6):
sl = df2['sl_pt'][j][0]
pt = df2['sl_pt'][j][1]
s = ft.reduce(np.logical_or, [(selection['mae'.format(m)] >= 0) & (selection['mae'.format(m)] <= sl) & (selection['mfe'.format(m)] >= pt) for m in range(1,4)])
df2.loc[j, df2_columns[i]] = sum(s) * pt * 10 + (len(s) - sum(s)) * sl * -10
python performance beginner numpy pandas
This for-loop is the biggest bottleneck in my project, and I'm curious whether there is an opportunity to speed it up by improving the grammar or by using a different strategy altogether, perhaps. I'm a noob to Python and migrating this from Excel.
Here's some sample data and the scheme. j's range is actually 25k and i's about 1k, so it takes quite a while.
Functionally, it selects rows in df1 where a string in any of the 'typeX' columns matches a particular element in df2_columns, then tests the sl_pt element on each of those rows. It creates a 'dummy' array s to reduce the 3-part 'or' test down to a T/F, and performs some math on the count of T's as well as the count of F's. It tests with every sl_pt element and then goes to the next element in df2_columns, rinse, repeat.
import pandas as pd
import numpy as np
import functools as ft
df1 = pd.DataFrame('type1' : ['A1_v3', 'A2_v2', 'A1_v3', 'B1_v2'],
'type2' : ['A1_vr1', 'A2_vr3', 'A1_vr3', 'B1_vr2'],
'type3' : ['A1_vp4', 'A2_vp2', 'A1_vp3', 'B1_vp4'],
'mae1' : [14, 16, 4, 63],
'mae2' : [17, 29, 38, 63],
'mae3' : [27, 29, 77, 63],
'mfe1' : [48, 31, 59, 0],
'mfe2' : [79, 37, 59, 14],
'mfe3' : [79, 108, 71, 36])
df2 = pd.DataFrame('sl_pt' : [[22, 62], [22, 63], [22, 64], [22, 66], [22, 68], [22, 70]])
df2_columns = ['A1_v3', 'A2_vr3', 'B1_vp4']
for i in range(3):
tag = ft.reduce(np.logical_or, [df1['type'.format(k)].str.startswith(df2_columns[i]) for k in range(1,4)])
selection = df1.loc[tag]
for j in range(6):
sl = df2['sl_pt'][j][0]
pt = df2['sl_pt'][j][1]
s = ft.reduce(np.logical_or, [(selection['mae'.format(m)] >= 0) & (selection['mae'.format(m)] <= sl) & (selection['mfe'.format(m)] >= pt) for m in range(1,4)])
df2.loc[j, df2_columns[i]] = sum(s) * pt * 10 + (len(s) - sum(s)) * sl * -10
python performance beginner numpy pandas
edited May 21 at 1:48
asked May 15 at 1:42
CmB
63
63
add a comment |Â
add a comment |Â
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f194411%2faggregate-conditional-logic-test-results-iterated-over-changing-row-selections%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password