Dropping rows from a PANDAS dataframe where some of the columns have value 0


I am dropping rows from a pandas DataFrame when some of its columns have the value 0. The code below gives the output I want, but I hope the same can be done with less code, perhaps in a single line.

df:

    A  B  C
 0  1  2  5
 1  4  4  0
 2  6  8  4
 3  0  4  2

My code:

 drop_A=df.index[df["A"] == 0].tolist()
 drop_B=df.index[df["C"] == 0].tolist()
 c=drop_A+drop_B
 df=df.drop(df.index[c])

[out]

    A  B  C
 0  1  2  5
 2  6  8  4
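
For anyone who wants to run the snippet above, here is a minimal self-contained sketch. It assumes the sample frame is the one constructed in the second answer below, and it passes the collected labels straight to `df.drop` rather than going back through `df.index[...]`, which only lines up when the index is the default 0..n-1 range. The names `drop_a`, `drop_c` and `labels_to_drop` are illustrative, not from the original post.

```python
import pandas as pd

# Sample data assumed from the DataFrame built in the second answer.
df = pd.DataFrame({"A": [1, 4, 6, 0], "B": [2, 4, 8, 4], "C": [5, 0, 4, 2]})

# Collect the index labels of the rows where A or C is 0.
drop_a = df.index[df["A"] == 0].tolist()
drop_c = df.index[df["C"] == 0].tolist()
labels_to_drop = drop_a + drop_c

# df.drop takes labels directly, so no second lookup through df.index is needed.
result = df.drop(labels_to_drop)
print(result)
```

This keeps the multi-step shape of the original; the answers show shorter ways to get the same rows.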

asked Jan 18 at 11:19 by pyd
edited Jan 26 at 18:50 by 200_success

Do you want to know a better way to do what your code is doing, or do you want us to code golf it?
– Peilonrayz, Jan 18 at 11:27

I need a better way
– pyd, Jan 18 at 11:27


2 Answers

accepted

I think you need to create a boolean DataFrame by comparing the selected columns against the scalar 0 for inequality, and then use all to keep only the rows where every value is True:

df = df[(df[['A','C']] != 0).all(axis=1)]
print(df)
   A  B  C
0  1  2  5
2  6  8  4

Details:

print(df[['A','C']] != 0)
       A      C
0   True   True
1   True  False
2   True   True
3  False   True

print((df[['A','C']] != 0).all(axis=1))
0     True
1    False
2     True
3    False
dtype: bool
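
To try this end to end, here is a small sketch that wraps the mask in a helper. The function name `drop_rows_with_zero` and its parameters are hypothetical, and the sample frame is assumed to be the one from the question.

```python
import pandas as pd


def drop_rows_with_zero(frame, columns):
    """Return the rows of `frame` where none of `columns` equals 0."""
    # True for each row whose values in `columns` are all non-zero.
    keep = (frame[columns] != 0).all(axis=1)
    return frame[keep]


# Sample data assumed from the question.
df = pd.DataFrame({"A": [1, 4, 6, 0], "B": [2, 4, 8, 4], "C": [5, 0, 4, 2]})

filtered = drop_rows_with_zero(df, ["A", "C"])
print(filtered)
```

Column B is not part of the mask, so its values never affect which rows are kept, and the original `df` is left untouched.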

Alternatively, create a boolean DataFrame by comparing the selected columns against the scalar 0 for equality, use any to flag the rows that contain at least one True, and finally invert the mask with ~:

df = df[~(df[['A','C']] == 0).any(axis=1)]

Details:

print(df[['A','C']])
   A  C
0  1  5
1  4  0
2  6  4
3  0  2

print(df[['A','C']] == 0)
       A      C
0  False  False
1  False   True
2  False  False
3   True  False

print((df[['A','C']] == 0).any(axis=1))
0    False
1     True
2    False
3     True
dtype: bool

print(~(df[['A','C']] == 0).any(axis=1))
0     True
1    False
2     True
3    False
dtype: bool
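
The two masks are the same condition written two ways ("all not equal to 0" versus "not any equal to 0"). A quick sketch to check that on the sample data, which is assumed to match the question; the variable names are illustrative:

```python
import pandas as pd

# Sample data assumed from the question.
df = pd.DataFrame({"A": [1, 4, 6, 0], "B": [2, 4, 8, 4], "C": [5, 0, 4, 2]})
cols = ["A", "C"]

# First form: every selected value in the row is non-zero.
mask_all = (df[cols] != 0).all(axis=1)

# Second form: no selected value in the row is zero.
mask_not_any = ~(df[cols] == 0).any(axis=1)

# Both masks select the same rows.
same = mask_all.equals(mask_not_any)
print(same)
```

Which one to use is mostly a matter of which reads more naturally to you.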

answered Jan 18 at 11:28 by jezrael
edited Jan 18 at 11:41

Jezrael, I want to consider only columns A and C; please check my question again.
– pyd, Jan 18 at 11:31

@pyd Clarify this in your question.
– Mast, Jan 18 at 11:39

You have both "all not equal to 0" and "not any equal to zero". Did you intend these to be two options, or did you accidentally post two solutions?
– Acccumulation, Jan 18 at 17:51

@Accumulation No, it was no accident. I posted the best solution first and a second, also very nice, solution after it. :)
– jezrael, Jan 18 at 18:03


One-line hack using .dropna()

import numpy as np
import pandas as pd

df = pd.DataFrame({'A':[1,4,6,0],'B':[2,4,8,4],'C':[5,0,4,2]})
print(df)
   A  B  C
0  1  2  5
1  4  4  0
2  6  8  4
3  0  4  2

columns = ['A', 'C']
# np.nan is used here because pd.np is deprecated and absent from recent pandas releases
df = df.replace(0, np.nan).dropna(axis=0, how='any', subset=columns).fillna(0).astype(int)

print(df)
   A  B  C
0  1  2  5
2  6  8  4

So, what's happening is:

Replace 0 with NaN using .replace()

Use .dropna() to drop the rows with NaN, considering only columns A and C

Replace NaN back with 0 using .fillna() (not needed if you use all columns instead of only a subset)

Correct the data type from float back to int with .astype()
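
The four steps can be run one at a time to see what each does. This is only a sketch: the frame below is a hypothetical variation of the sample data with a 0 placed in column B of row 0, so that the .fillna() step has something to restore, and the `step1` to `step4` names are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical variation of the sample data: B has a 0 in a row that survives.
df = pd.DataFrame({"A": [1, 4, 6, 0], "B": [0, 4, 8, 4], "C": [5, 0, 4, 2]})
columns = ["A", "C"]

# Step 1: every 0 becomes NaN, which turns the affected columns into floats.
step1 = df.replace(0, np.nan)

# Step 2: drop rows that have NaN in A or C; NaN in B is ignored here.
step2 = step1.dropna(axis=0, how="any", subset=columns)

# Step 3: the NaN left in B goes back to 0.
step3 = step2.fillna(0)

# Step 4: cast the float columns back to integers.
step4 = step3.astype(int)
print(step4)
```

One thing to keep in mind with this approach: .fillna(0) also turns any NaN that was already in the data into 0, and .astype(int) truncates genuine float columns, so it fits best when the frame holds only integers with no missing values.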

answered Jan 23 at 9:08 by paulo.filip3
edited Jan 26 at 17:48
