Aggregate conditional logic test results iterated over changing row selections

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
1
down vote

favorite












This for-loop is the biggest bottleneck in my project, and I'm curious whether there is an opportunity to speed it up by improving the grammar or by using a different strategy altogether, perhaps. I'm a noob to Python and migrating this from Excel.



Here's some sample data and the scheme. j's range is actually 25k and i's about 1k, so it takes quite a while.



Functionally, it selects rows in df1 where a string in any of the 'typeX' columns matches a particular element in df2_columns, then tests the sl_pt element on each of those rows. It creates a 'dummy' array s to reduce the 3-part 'or' test down to a T/F, and performs some math on the count of T's as well as the count of F's. It tests with every sl_pt element and then goes to the next element in df2_columns, rinse, repeat.



import pandas as pd
import numpy as np
import functools as ft

df1 = pd.DataFrame('type1' : ['A1_v3', 'A2_v2', 'A1_v3', 'B1_v2'],
'type2' : ['A1_vr1', 'A2_vr3', 'A1_vr3', 'B1_vr2'],
'type3' : ['A1_vp4', 'A2_vp2', 'A1_vp3', 'B1_vp4'],
'mae1' : [14, 16, 4, 63],
'mae2' : [17, 29, 38, 63],
'mae3' : [27, 29, 77, 63],
'mfe1' : [48, 31, 59, 0],
'mfe2' : [79, 37, 59, 14],
'mfe3' : [79, 108, 71, 36])

df2 = pd.DataFrame('sl_pt' : [[22, 62], [22, 63], [22, 64], [22, 66], [22, 68], [22, 70]])

df2_columns = ['A1_v3', 'A2_vr3', 'B1_vp4']

for i in range(3):
tag = ft.reduce(np.logical_or, [df1['type'.format(k)].str.startswith(df2_columns[i]) for k in range(1,4)])
selection = df1.loc[tag]
for j in range(6):
sl = df2['sl_pt'][j][0]
pt = df2['sl_pt'][j][1]
s = ft.reduce(np.logical_or, [(selection['mae'.format(m)] >= 0) & (selection['mae'.format(m)] <= sl) & (selection['mfe'.format(m)] >= pt) for m in range(1,4)])
df2.loc[j, df2_columns[i]] = sum(s) * pt * 10 + (len(s) - sum(s)) * sl * -10






share|improve this question



























    up vote
    1
    down vote

    favorite












    This for-loop is the biggest bottleneck in my project, and I'm curious whether there is an opportunity to speed it up by improving the grammar or by using a different strategy altogether, perhaps. I'm a noob to Python and migrating this from Excel.



    Here's some sample data and the scheme. j's range is actually 25k and i's about 1k, so it takes quite a while.



    Functionally, it selects rows in df1 where a string in any of the 'typeX' columns matches a particular element in df2_columns, then tests the sl_pt element on each of those rows. It creates a 'dummy' array s to reduce the 3-part 'or' test down to a T/F, and performs some math on the count of T's as well as the count of F's. It tests with every sl_pt element and then goes to the next element in df2_columns, rinse, repeat.



    import pandas as pd
    import numpy as np
    import functools as ft

    df1 = pd.DataFrame('type1' : ['A1_v3', 'A2_v2', 'A1_v3', 'B1_v2'],
    'type2' : ['A1_vr1', 'A2_vr3', 'A1_vr3', 'B1_vr2'],
    'type3' : ['A1_vp4', 'A2_vp2', 'A1_vp3', 'B1_vp4'],
    'mae1' : [14, 16, 4, 63],
    'mae2' : [17, 29, 38, 63],
    'mae3' : [27, 29, 77, 63],
    'mfe1' : [48, 31, 59, 0],
    'mfe2' : [79, 37, 59, 14],
    'mfe3' : [79, 108, 71, 36])

    df2 = pd.DataFrame('sl_pt' : [[22, 62], [22, 63], [22, 64], [22, 66], [22, 68], [22, 70]])

    df2_columns = ['A1_v3', 'A2_vr3', 'B1_vp4']

    for i in range(3):
    tag = ft.reduce(np.logical_or, [df1['type'.format(k)].str.startswith(df2_columns[i]) for k in range(1,4)])
    selection = df1.loc[tag]
    for j in range(6):
    sl = df2['sl_pt'][j][0]
    pt = df2['sl_pt'][j][1]
    s = ft.reduce(np.logical_or, [(selection['mae'.format(m)] >= 0) & (selection['mae'.format(m)] <= sl) & (selection['mfe'.format(m)] >= pt) for m in range(1,4)])
    df2.loc[j, df2_columns[i]] = sum(s) * pt * 10 + (len(s) - sum(s)) * sl * -10






    share|improve this question























      up vote
      1
      down vote

      favorite









      up vote
      1
      down vote

      favorite











      This for-loop is the biggest bottleneck in my project, and I'm curious whether there is an opportunity to speed it up by improving the grammar or by using a different strategy altogether, perhaps. I'm a noob to Python and migrating this from Excel.



      Here's some sample data and the scheme. j's range is actually 25k and i's about 1k, so it takes quite a while.



      Functionally, it selects rows in df1 where a string in any of the 'typeX' columns matches a particular element in df2_columns, then tests the sl_pt element on each of those rows. It creates a 'dummy' array s to reduce the 3-part 'or' test down to a T/F, and performs some math on the count of T's as well as the count of F's. It tests with every sl_pt element and then goes to the next element in df2_columns, rinse, repeat.



      import pandas as pd
      import numpy as np
      import functools as ft

      df1 = pd.DataFrame('type1' : ['A1_v3', 'A2_v2', 'A1_v3', 'B1_v2'],
      'type2' : ['A1_vr1', 'A2_vr3', 'A1_vr3', 'B1_vr2'],
      'type3' : ['A1_vp4', 'A2_vp2', 'A1_vp3', 'B1_vp4'],
      'mae1' : [14, 16, 4, 63],
      'mae2' : [17, 29, 38, 63],
      'mae3' : [27, 29, 77, 63],
      'mfe1' : [48, 31, 59, 0],
      'mfe2' : [79, 37, 59, 14],
      'mfe3' : [79, 108, 71, 36])

      df2 = pd.DataFrame('sl_pt' : [[22, 62], [22, 63], [22, 64], [22, 66], [22, 68], [22, 70]])

      df2_columns = ['A1_v3', 'A2_vr3', 'B1_vp4']

      for i in range(3):
      tag = ft.reduce(np.logical_or, [df1['type'.format(k)].str.startswith(df2_columns[i]) for k in range(1,4)])
      selection = df1.loc[tag]
      for j in range(6):
      sl = df2['sl_pt'][j][0]
      pt = df2['sl_pt'][j][1]
      s = ft.reduce(np.logical_or, [(selection['mae'.format(m)] >= 0) & (selection['mae'.format(m)] <= sl) & (selection['mfe'.format(m)] >= pt) for m in range(1,4)])
      df2.loc[j, df2_columns[i]] = sum(s) * pt * 10 + (len(s) - sum(s)) * sl * -10






      share|improve this question













      This for-loop is the biggest bottleneck in my project, and I'm curious whether there is an opportunity to speed it up by improving the grammar or by using a different strategy altogether, perhaps. I'm a noob to Python and migrating this from Excel.



      Here's some sample data and the scheme. j's range is actually 25k and i's about 1k, so it takes quite a while.



      Functionally, it selects rows in df1 where a string in any of the 'typeX' columns matches a particular element in df2_columns, then tests the sl_pt element on each of those rows. It creates a 'dummy' array s to reduce the 3-part 'or' test down to a T/F, and performs some math on the count of T's as well as the count of F's. It tests with every sl_pt element and then goes to the next element in df2_columns, rinse, repeat.



      import pandas as pd
      import numpy as np
      import functools as ft

      df1 = pd.DataFrame('type1' : ['A1_v3', 'A2_v2', 'A1_v3', 'B1_v2'],
      'type2' : ['A1_vr1', 'A2_vr3', 'A1_vr3', 'B1_vr2'],
      'type3' : ['A1_vp4', 'A2_vp2', 'A1_vp3', 'B1_vp4'],
      'mae1' : [14, 16, 4, 63],
      'mae2' : [17, 29, 38, 63],
      'mae3' : [27, 29, 77, 63],
      'mfe1' : [48, 31, 59, 0],
      'mfe2' : [79, 37, 59, 14],
      'mfe3' : [79, 108, 71, 36])

      df2 = pd.DataFrame('sl_pt' : [[22, 62], [22, 63], [22, 64], [22, 66], [22, 68], [22, 70]])

      df2_columns = ['A1_v3', 'A2_vr3', 'B1_vp4']

      for i in range(3):
      tag = ft.reduce(np.logical_or, [df1['type'.format(k)].str.startswith(df2_columns[i]) for k in range(1,4)])
      selection = df1.loc[tag]
      for j in range(6):
      sl = df2['sl_pt'][j][0]
      pt = df2['sl_pt'][j][1]
      s = ft.reduce(np.logical_or, [(selection['mae'.format(m)] >= 0) & (selection['mae'.format(m)] <= sl) & (selection['mfe'.format(m)] >= pt) for m in range(1,4)])
      df2.loc[j, df2_columns[i]] = sum(s) * pt * 10 + (len(s) - sum(s)) * sl * -10








      share|improve this question












      share|improve this question




      share|improve this question








      edited May 21 at 1:48
























      asked May 15 at 1:42









      CmB

      63




      63

























          active

          oldest

          votes











          Your Answer




          StackExchange.ifUsing("editor", function ()
          return StackExchange.using("mathjaxEditing", function ()
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
          );
          );
          , "mathjax-editing");

          StackExchange.ifUsing("editor", function ()
          StackExchange.using("externalEditor", function ()
          StackExchange.using("snippets", function ()
          StackExchange.snippets.init();
          );
          );
          , "code-snippets");

          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "196"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          convertImagesToLinks: false,
          noModals: false,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );








           

          draft saved


          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f194411%2faggregate-conditional-logic-test-results-iterated-over-changing-row-selections%23new-answer', 'question_page');

          );

          Post as a guest



































          active

          oldest

          votes













          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes










           

          draft saved


          draft discarded


























           


          draft saved


          draft discarded














          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f194411%2faggregate-conditional-logic-test-results-iterated-over-changing-row-selections%23new-answer', 'question_page');

          );

          Post as a guest













































































          Popular posts from this blog

          Python Lists

          Aion

          JavaScript Array Iteration Methods