Set all values in one column to NaN if the corresponding values in another column are also NaN

The goal is to maintain the relationship between two columns: wherever column a is NaN, the corresponding value in column b should be set to NaN as well.



Given the following data frame:

df = pd.DataFrame({'a': [np.nan, 2, np.nan, 4], 'b': [11, 12, 13, 14]})

     a   b
0  NaN  11
1    2  12
2  NaN  13
3    4  14


Propagating the NaN values from column a to column b should result in:

     a    b
0  NaN  NaN
1    2   12
2  NaN  NaN
3    4   14


One way to achieve the desired behaviour is:



df.b.where(~df.a.isnull(), np.nan)


Is there any other way to maintain such a relationship?







asked Aug 6 at 15:21 by Krzysztof Słowiński











  • "Is there any other way..." What's wrong with your current method? Are you looking for cleaner syntax, a more efficient solution, or something else?
    – jpp
    Aug 6 at 15:38

  • A cleaner or recommended way.
    – Krzysztof Słowiński
    Aug 6 at 15:45
















5 Answers
accepted

You could use mask to blank out the rows where column a is NaN.

In [366]: df.mask(df.a.isnull())
Out[366]:
     a     b
0  NaN   NaN
1  2.0  12.0
2  NaN   NaN
3  4.0  14.0

To mask every row that contains a NaN in any column, use df.mask(df.isnull().any(axis=1)).






answered Aug 6 at 15:24 by Zero







  • You can also use inplace=True for the changes to stick.
    – jpp
    Aug 6 at 15:37
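A minimal sketch of the mask approach, assuming pandas and NumPy are installed. The third column c and the name a_missing are hypothetical, added only to show how the masking can be limited to one column. Because mask returns a new object, the result is assigned back here rather than relying on inplace=True.

```python
import numpy as np
import pandas as pd

# Example frame from the question, plus a hypothetical column "c"
# that should stay untouched.
df = pd.DataFrame({
    "a": [np.nan, 2, np.nan, 4],
    "b": [11, 12, 13, 14],
    "c": ["w", "x", "y", "z"],
})

# Boolean Series marking the rows where the key column is missing.
a_missing = df["a"].isna()

# Mask only column "b"; mask() returns a new Series, so assign it back.
df["b"] = df["b"].mask(a_missing)

print(df)
```

Masking a single Series instead of the whole frame leaves every other column as it was, which seems closer to what the question asks for.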












Use pd.Series.notnull to avoid having to negate your Boolean series:

df.b.where(df.a.notnull(), np.nan)

But, really, there's nothing wrong with your existing solution.







answered Aug 6 at 15:47 by jpp
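As a sketch of how the where variant might be applied in practice, assuming the example frame from the question: where keeps the values for which the condition is True and fills the rest with NaN by default, so the second argument can be left out. It also returns a new Series, so the result has to be assigned back to the column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [np.nan, 2, np.nan, 4], "b": [11, 12, 13, 14]})

# where() keeps values where the condition holds; everything else
# becomes NaN, which is the default replacement value.
df["b"] = df["b"].where(df["a"].notna())

print(df)
```

Column b ends up with a floating-point dtype, since an integer column cannot hold NaN.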




















Using dropna with reindex:

df.dropna().reindex(df.index)
Out[151]:
     a     b
0  NaN   NaN
1  2.0  12.0
2  NaN   NaN
3  4.0  14.0






answered Aug 6 at 15:24 by Wen











  • This solution would only work across the columns, right? I would like to be able to apply it to a single column or a selected set of columns.
    – Krzysztof Słowiński
    Aug 7 at 0:48
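One possible way to address the comment above, sketched under the assumption that only column a should decide which rows are dropped: dropna accepts a subset argument. The extra column c is hypothetical and carries its own NaN to show that it does not trigger a drop. Note that reindex still blanks every column in the affected rows, so this restricts which columns are checked, not which columns are overwritten.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [np.nan, 2, np.nan, 4],
    "b": [11, 12, 13, 14],
    "c": [1.0, np.nan, 3.0, 4.0],  # hypothetical extra column
})

# Drop rows only where "a" is NaN, then restore the original index;
# the restored rows come back filled with NaN in every column.
result = df.dropna(subset=["a"]).reindex(df.index)

print(result)
```

Row 1 survives even though c is NaN there, because only a is inspected.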
















Another option would be:

df.loc[df.a.isnull(), 'b'] = df.a

It isn't shorter, but it does the job.







answered Aug 6 at 15:31 by zipa
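A variant of the loc idea, as a sketch rather than the answer's exact code: assigning np.nan directly makes the intent explicit and extends to several target columns at once. The column c is hypothetical. The target columns are created as floats here on the assumption that recent pandas versions may warn about, or reject, writing NaN into an integer column in place.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [np.nan, 2, np.nan, 4],
    "b": [11.0, 12.0, 13.0, 14.0],
    "c": [21.0, 22.0, 23.0, 24.0],  # hypothetical extra column
})

# Select the rows where "a" is NaN and overwrite the chosen columns in place.
df.loc[df["a"].isna(), ["b", "c"]] = np.nan

print(df)
```

Unlike mask and where, this modifies df directly, so no reassignment is needed.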




















Using np.where():

df['b'] = np.where(df.a.isnull(), df.a, df.b)

How it works: np.where(condition, x, y) returns elements chosen from x where condition is True and from y otherwise.

Output:

>>> df
     a     b
0  NaN   NaN
1  2.0  12.0
2  NaN   NaN
3  4.0  14.0






answered Aug 6 at 15:47 by Van Peer
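A self-contained sketch of the np.where approach, assuming the example frame from the question. Here np.nan is passed directly as the value to use where a is missing, instead of df.a; both put NaN in the selected positions. The name new_b is hypothetical. np.where returns a plain NumPy array, which is then assigned back as column b.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [np.nan, 2, np.nan, 4], "b": [11, 12, 13, 14]})

# NaN where "a" is missing, otherwise the existing value of "b".
new_b = np.where(df["a"].isna(), np.nan, df["b"])

# The result is a NumPy array; assigning it replaces the whole column.
df["b"] = new_b

print(df)
```

Because the result is an array without an index, this relies on positional order, which is fine as long as both inputs come from the same frame.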






















                                 
