Compute conditional median of PANDAS dataframe

I am new to Python/Pandas.
Consider the following code:



import pandas as pd
import numpy as np

df = pd.DataFrame({'Time': [0.0, 1.0, 2.0, 0.0, 1.0, 2.0, 0.0, 2.0, 0.0, 1.0, 2.0],
                   'Id': [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4],
                   'A': [10, 15, np.nan, 11, 16, 25, 10, 15, 9, 14, 19]})
print(df)


Output:



       A  Id  Time
0   10.0   1   0.0
1   15.0   1   1.0
2    NaN   1   2.0
3   11.0   2   0.0
4   16.0   2   1.0
5   25.0   2   2.0
6   10.0   3   0.0
7   15.0   3   2.0
8    9.0   4   0.0
9   14.0   4   1.0
10  19.0   4   2.0


I want to add a column Feature_1 which, for each row of the dataframe, computes the median of column A over ALL the values that have the same Time value. This can be done as follows:



df['Feature_1'] = df.groupby('Time')['A'].transform(np.median)
print(df)


Output:



       A  Id  Time  Feature_1
0   10.0   1   0.0       10.0
1   15.0   1   1.0       15.0
2    NaN   1   2.0       19.0
3   11.0   2   0.0       10.0
4   16.0   2   1.0       15.0
5   25.0   2   2.0       19.0
6   10.0   3   0.0       10.0
7   15.0   3   2.0       19.0
8    9.0   4   0.0       10.0
9   14.0   4   1.0       15.0
10  19.0   4   2.0       19.0
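
To make the Feature_1 step more concrete, here is a minimal sketch of how the grouped median relates to the transformed column. It uses the string alias 'median' rather than np.median, and the names per_group and feature_1 are only illustrative. The point is that the group median skips NaN by default and that transform broadcasts one value per group back onto every row of that group.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Time': [0.0, 1.0, 2.0, 0.0, 1.0, 2.0, 0.0, 2.0, 0.0, 1.0, 2.0],
    'Id': [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4],
    'A': [10, 15, np.nan, 11, 16, 25, 10, 15, 9, 14, 19],
})

# One median per distinct Time value; NaN entries are skipped by default.
per_group = df.groupby('Time')['A'].median()

# transform returns a Series aligned with df: every row receives the
# median of the Time group it belongs to.
feature_1 = df.groupby('Time')['A'].transform('median')

print(per_group)
print(feature_1)
```

The transformed Series has the same index as df, which is why it can be assigned directly as a new column.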


My problem now is to compute another feature, Feature_2, which, for each row of the dataframe, computes the median of column A over the OTHER values (those with a different Id) that have the same Time value. I was not able to vectorize this, so my solution uses a for loop:



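df['Feature_2'] = np.nan

for i in range(len(df)):
    current_Id = df.Id[i]
    current_time = df.Time[i]

    # Rows that share the current Time but belong to another Id
    idx = (df.Time == current_time) & (df.Id != current_Id)

    if idx.any():
        # .loc avoids chained assignment on the new column
        df.loc[i, 'Feature_2'] = df.A[idx].median()

print(df)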


Output:



       A  Id  Time  Feature_1  Feature_2
0   10.0   1   0.0       10.0       10.0
1   15.0   1   1.0       15.0       15.0
2    NaN   1   2.0       19.0       19.0
3   11.0   2   0.0       10.0       10.0
4   16.0   2   1.0       15.0       14.5
5   25.0   2   2.0       19.0       17.0
6   10.0   3   0.0       10.0       10.0
7   15.0   3   2.0       19.0       22.0
8    9.0   4   0.0       10.0       10.0
9   14.0   4   1.0       15.0       15.5
10  19.0   4   2.0       19.0       20.0


This works, but it is very slow because my dataframe has 1 million rows (but only four different IDs).

Is it possible to vectorize the creation of Feature_2?

I hope I am clear enough. Live code can be found here.







edited May 14 at 14:49 by 200_success
asked May 14 at 13:06 by Romain




















          1 Answer (accepted)










          So, you want to get the medians of the groups by removing each value from the group in turn:



           group          individual removal of values

            NaN           [  ]    NaN    NaN    NaN
           25.0    =>     25.0   [  ]   25.0   25.0
           15.0           15.0   15.0   [  ]   15.0
           19.0           19.0   19.0   19.0   [  ]

          median 19.0     19.0   17.0   22.0   20.0


          Another way to do this, besides manually reconstructing the group without the current value for each value, is to build the intermediate matrix above and take the median of each column. This returns a Series whose length is the length of the group, which is supported by SeriesGroupBy.transform.



          The steps to get the desired result are:



          • build the matrix by repeating the input group as many times as its length;


          • fill the diagonal of the matrix with NaNs;

          • ask for the median by row/column depending on how you built the matrix.
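
The three steps above can be sketched with plain NumPy on the Time == 2.0 group from the question. This is only an illustration under the assumption that the group is available as a float array; the names group, matrix and leave_one_out are hypothetical. Here each row (rather than each column, as in the diagram) holds the group with one element removed, so the median is taken by row.

```python
import numpy as np

# Column A for the rows where Time == 2.0
group = np.array([np.nan, 25.0, 15.0, 19.0])

# Step 1: repeat the group once per element, one row per removal.
matrix = np.tile(group, (len(group), 1))

# Step 2: fill the diagonal with NaN so that row i no longer holds element i.
np.fill_diagonal(matrix, np.nan)

# Step 3: median of each row, ignoring NaN.
leave_one_out = np.nanmedian(matrix, axis=1)

print(leave_one_out)  # [19. 17. 22. 20.]
```

The result matches the Feature_2 values expected for rows 2, 5, 7 and 10 of the question. Note that the matrix grows with the square of the group size, so this fits best when each group is small.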

          The function that can be fed to transform may look like:



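          def median_without_element(group):
              # Each row of the matrix is a copy of the whole group
              matrix = pd.DataFrame([group] * len(group))
              # Blank out element i in row i
              np.fill_diagonal(matrix.values, np.nan)
              # NaN is skipped, so row i yields the median of the other elements
              return matrix.median(axis=1)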



          Another advantage of this approach is that you can reuse the same groups of elements, which avoids recomputing them again and again:



          import numpy as np
          import pandas as pd


          def median_without_element(group):
              # Each row of the matrix is a copy of the whole group
              matrix = pd.DataFrame([group] * len(group))
              # Blank out element i in row i
              np.fill_diagonal(matrix.values, np.nan)
              return matrix.median(axis=1)


          def compute_medians(dataframe, groups_column='Time', values_column='A'):
              # Group once and reuse the groups for both features
              groups = dataframe.groupby(groups_column)[values_column]
              dataframe['Feature_1'] = groups.transform(np.median)
              dataframe['Feature_2'] = groups.transform(median_without_element)


          if __name__ == '__main__':
              df = pd.DataFrame({
                  'Time': [0.0, 1.0, 2.0, 0.0, 1.0, 2.0, 0.0, 2.0, 0.0, 1.0, 2.0],
                  'Id': [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4],
                  'A': [10, 15, np.nan, 11, 16, 25, 10, 15, 9, 14, 19],
              })
              compute_medians(df)
              print(df)
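
Since the question mentions only four distinct Ids, a different route may also be worth trying. The sketch below is not the approach from the answer above: it pivots the data so that each Id becomes a column, then takes, for each Id, the row-wise median of the remaining columns. The helper name median_of_other_ids is hypothetical, and the sketch assumes there is at most one row per (Time, Id) pair, as in the sample data (pivot raises an error otherwise).

```python
import numpy as np
import pandas as pd


def median_of_other_ids(dataframe, group_col='Time', id_col='Id', value_col='A'):
    # Hypothetical helper; assumes at most one row per (group_col, id_col) pair.
    # Wide layout: one row per Time value, one column per Id.
    wide = dataframe.pivot(index=group_col, columns=id_col, values=value_col)

    pieces = []
    for current_id in wide.columns:
        # Median over the other Id columns; NaN values are skipped.
        others = wide.drop(columns=current_id).median(axis=1)
        pieces.append(pd.DataFrame({
            group_col: others.index,
            id_col: current_id,
            'Feature_2': others.to_numpy(),
        }))

    # Lookup table keyed by (Time, Id), joined back onto the original rows.
    lookup = pd.concat(pieces, ignore_index=True)
    return dataframe.merge(lookup, on=[group_col, id_col], how='left')


df = pd.DataFrame({
    'Time': [0.0, 1.0, 2.0, 0.0, 1.0, 2.0, 0.0, 2.0, 0.0, 1.0, 2.0],
    'Id': [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4],
    'A': [10, 15, np.nan, 11, 16, 25, 10, 15, 9, 14, 19],
})
result = median_of_other_ids(df)
print(result)
```

The Python-level loop here runs once per distinct Id rather than once per row or per group, so the bulk of the work stays inside pandas. Whether it is actually faster on the real data would need to be measured.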





          • Thanks, it works ;). I have one question though: in the compute_medians function, shouldn't you replace df with dataframe?
            – Romain
            May 14 at 16:44










          • @Romain Absolutely, fixed it.
            – Mathias Ettinger
            May 14 at 17:09










          edited May 14 at 17:08
          answered May 14 at 14:53 by Mathias Ettinger










