Dropping rows from a pandas DataFrame where some of the columns have value 0

I am dropping rows from a pandas DataFrame when certain columns (A and C in the example below) contain 0. The code below produces the output I want, but I hope the same can be done with less code, perhaps in a single line.



df:



   A  B  C
0  1  2  5
1  4  4  0
2  6  8  4
3  0  4  2


My code:



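# Index labels of the rows where column A or column C is 0
drop_A = df.index[df["A"] == 0].tolist()
drop_C = df.index[df["C"] == 0].tolist()
c = drop_A + drop_C
# df.index[c] treats the labels as positions; the two coincide here
# because df uses the default 0..n-1 index
df = df.drop(df.index[c])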


[out]



   A  B  C
0  1  2  5
2  6  8  4






  • Do you want to know a better way to do what your code is doing, or do you want us to code golf it?
    – Peilonrayz
    Jan 18 at 11:27










  • I need a better way
    – pyd
    Jan 18 at 11:27
















asked Jan 18 at 11:19 by pyd, edited Jan 26 at 18:50 by 200_success











2 Answers

Answer 1 (accepted, score 9)

Build a boolean DataFrame by comparing the selected columns against 0 for inequality, then use all(axis=1) to keep only the rows in which every selected value is True:



df = df[(df[['A','C']] != 0).all(axis=1)]
print(df)
   A  B  C
0  1  2  5
2  6  8  4


Details:



print(df[['A','C']] != 0)
       A      C
0   True   True
1   True  False
2   True   True
3  False   True

print((df[['A','C']] != 0).all(axis=1))
0     True
1    False
2     True
3    False
dtype: bool
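
The filter above can be wrapped in a small helper so that the column list is passed in rather than hard-coded. This is only a sketch: the name drop_rows_with_zeros is hypothetical (it is not part of pandas or of this answer), and it assumes the listed columns are numeric.

```python
import pandas as pd


def drop_rows_with_zeros(frame, columns):
    """Return the rows of `frame` in which none of `columns` equals 0.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Boolean DataFrame: True wherever a selected value is non-zero.
    non_zero = frame[columns] != 0
    # Keep a row only if every selected column is non-zero.
    return frame[non_zero.all(axis=1)]


# Same data as in the question.
df = pd.DataFrame({"A": [1, 4, 6, 0], "B": [2, 4, 8, 4], "C": [5, 0, 4, 2]})
result = drop_rows_with_zeros(df, ["A", "C"])
print(result)
```

Because the helper returns a new DataFrame instead of reassigning df, the original data stays available for further checks.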


Alternatively, compare the selected columns with 0 for equality, use any(axis=1) to flag the rows that contain at least one 0, and finally invert the mask with ~:



df = df[~(df[['A','C']] == 0).any(axis=1)]


Details:



print(df[['A','C']])
   A  C
0  1  5
1  4  0
2  6  4
3  0  2

print(df[['A','C']] == 0)
       A      C
0  False  False
1  False   True
2  False  False
3   True  False

print((df[['A','C']] == 0).any(axis=1))
0    False
1     True
2    False
3     True
dtype: bool

print(~(df[['A','C']] == 0).any(axis=1))
0     True
1    False
2     True
3    False
dtype: bool
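
The two masks select the same rows: "every selected value is non-zero" is just the negation of "some selected value is zero" (De Morgan's law). A quick check on the example data, as a sketch; the variable names mask_all and mask_not_any are only illustrative.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({"A": [1, 4, 6, 0], "B": [2, 4, 8, 4], "C": [5, 0, 4, 2]})
cols = ["A", "C"]

# First approach: every selected value differs from 0.
mask_all = (df[cols] != 0).all(axis=1)
# Second approach: no selected value equals 0.
mask_not_any = ~(df[cols] == 0).any(axis=1)

print(mask_all.equals(mask_not_any))              # True
print(df[mask_all].equals(df[mask_not_any]))      # True
```

Since the results are identical, the choice between the two is mostly a matter of which reads more naturally.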





  • Jezrael, I want to consider only columns A and C; please check my question again
    – pyd
    Jan 18 at 11:31











  • @pyd Clarify this in your question.
    – Mast
    Jan 18 at 11:39










  • You have both "all not equal to 0" and "not any equal to zero". Did you intend these to be two options, or did you accidentally post two solutions?
    – Acccumulation
    Jan 18 at 17:51










  • @Accumulation No, it was no accident. I posted the best solution first and a very nice second one, so the best 2. :)
    – jezrael
    Jan 18 at 18:03

















Answer 2 (score 1)













One line hack using .dropna()



import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 6, 0], 'B': [2, 4, 8, 4], 'C': [5, 0, 4, 2]})
print(df)
   A  B  C
0  1  2  5
1  4  4  0
2  6  8  4
3  0  4  2

columns = ['A', 'C']
# np.nan is used directly because pd.np is not available in recent pandas versions
df = df.replace(0, np.nan).dropna(axis=0, how='any', subset=columns).fillna(0).astype(int)

print(df)
   A  B  C
0  1  2  5
2  6  8  4


So, what's happening is:



  1. Replace 0 with NaN using .replace()

  2. Use .dropna() to drop the rows that have NaN in column A or C

  3. Replace the remaining NaN values with 0 using .fillna() (not needed if you use all columns instead of only a subset)

  4. Convert the data type from float back to int with .astype()
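
The steps above can be sketched one at a time to make the intermediate state visible. This breakdown is illustrative rather than the answerer's code: the step1 to step4 names are made up, and the data is changed from the question so that column B contains a 0 in a row that survives, which is the case where step 3 actually matters.

```python
import numpy as np
import pandas as pd

# Assumption: modified example data, with a 0 in column B of row 0.
df = pd.DataFrame({"A": [1, 4, 6, 0], "B": [0, 4, 8, 4], "C": [5, 0, 4, 2]})
columns = ["A", "C"]

# 1. Every 0 becomes NaN, so the affected columns turn into floats.
step1 = df.replace(0, np.nan)

# 2. Drop rows with NaN in A or C; the NaN in B (row 0) is ignored.
step2 = step1.dropna(axis=0, how="any", subset=columns)

# 3. Turn the remaining NaN (here in B) back into 0.
step3 = step2.fillna(0)

# 4. Convert the float columns back to integers.
step4 = step3.astype(int)

print(step4)
```

One caveat with this approach: values that were already NaN before step 1 are treated like zeros, so they are dropped when they occur in A or C and turned into 0 elsewhere. The mask-based filter from the accepted answer does not have this side effect.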





Answer 1 (accepted): answered Jan 18 at 11:28 by jezrael, edited Jan 18 at 11:41











Answer 2: answered Jan 23 at 9:08 by paulo.filip3, edited Jan 26 at 17:48






















             
