Dropping rows from a PANDAS dataframe where some of the columns have value 0

I am dropping rows from a pandas DataFrame when some of its columns have the value 0. I got the expected output with the code below, but I hope the same can be done with less code, perhaps in a single line.
df:
A B C
0 1 2 5
1 4 4 0
2 6 8 4
3 0 4 2
My code:
drop_A=df.index[df["A"] == 0].tolist()
drop_B=df.index[df["C"] == 0].tolist()
c=drop_A+drop_B
df=df.drop(df.index[c])
[out]
A B C
0 1 2 5
2 6 8 4
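For anyone who wants to reproduce the example, here is a minimal sketch that builds the frame shown above and performs the same drop. Building the frame from a dict literal is an assumption about how df was created, and the name drop_C is only illustrative. Passing the collected labels straight to drop() is a small variation on the code above: it does not rely on the index being the default 0..n-1 range, whereas df.index[c] treats the labels as positions.

```python
import pandas as pd

# Assumed construction of the example frame shown in the question.
df = pd.DataFrame({"A": [1, 4, 6, 0], "B": [2, 4, 8, 4], "C": [5, 0, 4, 2]})

# Index labels of the rows where A or C equals 0.
drop_A = df.index[df["A"] == 0].tolist()
drop_C = df.index[df["C"] == 0].tolist()

# drop() accepts index labels directly, so no df.index[...] lookup is needed.
result = df.drop(drop_A + drop_C)
print(result)
```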
python pandas
edited Jan 26 at 18:50
200_success
asked Jan 18 at 11:19
pyd
Do you want to know a better way to do what your code is doing, or do you want us to code golf it?
– Peilonrayz
Jan 18 at 11:27
I need a better way
– pyd
Jan 18 at 11:27
2 Answers
9 votes, accepted
I think you need to create a boolean DataFrame by comparing the values of the filtered columns with the scalar 0 for inequality, and then check that all values in each row are True with all:
df = df[(df[['A','C']] != 0).all(axis=1)]
print (df)
A B C
0 1 2 5
2 6 8 4
Details:
print (df[['A','C']] != 0)
A C
0 True True
1 True False
2 True True
3 False True
print ((df[['A','C']] != 0).all(axis=1))
0 True
1 False
2 True
3 False
dtype: bool
Alternatively, create a boolean DataFrame by comparing the same columns with the scalar 0 for equality, check whether any value in each row is True with any, and finally invert the mask with ~:
df = df[~(df[['A','C']] == 0).any(axis=1)]
Details:
print (df[['A','C']])
A C
0 1 5
1 4 0
2 6 4
3 0 2
print (df[['A','C']] == 0)
A C
0 False False
1 False True
2 False False
3 True False
print ((df[['A','C']] == 0).any(axis=1))
0 False
1 True
2 False
3 True
dtype: bool
print (~(df[['A','C']] == 0).any(axis=1))
0 True
1 False
2 True
3 False
dtype: bool
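To check that the two masks above select the same rows, here is a small self-contained sketch. The frame construction and the names cols, keep_all and keep_not_any are assumptions made for illustration; they do not appear in the answer itself.

```python
import pandas as pd

# Assumed construction of the example frame from the question.
df = pd.DataFrame({"A": [1, 4, 6, 0], "B": [2, 4, 8, 4], "C": [5, 0, 4, 2]})
cols = ["A", "C"]

# First variant: every checked value in the row is non-zero.
keep_all = (df[cols] != 0).all(axis=1)

# Second variant: no checked value in the row is zero.
keep_not_any = ~(df[cols] == 0).any(axis=1)

# Both masks are boolean Series on the same index, so equals() compares them directly.
same = keep_all.equals(keep_not_any)
filtered = df[keep_all]
print(same)
print(filtered)
```

Since "all values are non-zero" and "no value is zero" are the same condition, either form can be used; the first avoids the extra inversion step.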
edited Jan 18 at 11:41
answered Jan 18 at 11:28
jezrael
Jezrael, I want to consider only columns A and C; please check my question again.
– pyd
Jan 18 at 11:31
@pyd Clarify this in your question.
– Mast
Jan 18 at 11:39
You have both "all not equal to 0" and "not any equal to zero". Did you intend these to be two options, or did you accidentally post two solutions?
– Acccumulation
Jan 18 at 17:51
@Accumulation No, it was no accident. I posted the best solution first and a very nice second-best one after it. :)
– jezrael
Jan 18 at 18:03
1 vote
A one-line hack using .dropna():
import numpy as np
import pandas as pd
df = pd.DataFrame({'A':[1,4,6,0],'B':[2,4,8,4],'C':[5,0,4,2]})
print(df)
A B C
0 1 2 5
1 4 4 0
2 6 8 4
3 0 4 2
columns = ['A', 'C']
df = df.replace(0, np.nan).dropna(axis=0, how='any', subset=columns).fillna(0).astype(int)
print(df)
A B C
0 1 2 5
2 6 8 4
So, what's happening is:
- Replace 0 by NaN with .replace()
- Use .dropna() to drop NaN considering only columns A and C
- Replace NaN back to 0 with .fillna() (not needed if you use all columns instead of only a subset)
- Correct the data type from float to int with .astype()
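The steps above touch every column of the frame. As a variation (an assumption about what is wanted, not part of the answer), the replacement can be limited to the checked columns, so that the other columns and the original dtypes stay untouched and the .fillna() and .astype() steps are not needed. The names checked and keep are hypothetical.

```python
import numpy as np
import pandas as pd

# Assumed construction of the example frame from the question.
df = pd.DataFrame({"A": [1, 4, 6, 0], "B": [2, 4, 8, 4], "C": [5, 0, 4, 2]})
columns = ["A", "C"]

# Turn zeros into NaN only in the checked columns, on a temporary copy.
checked = df[columns].replace(0, np.nan)

# Index labels of the rows that have no NaN left in the checked columns.
keep = checked.dropna(how="any").index

# Select those rows from the untouched original frame.
result = df.loc[keep]
print(result)
```

One thing to keep in mind with the full-frame version: any NaN that was already present in a column outside the subset would be turned into 0 by the final .fillna(0).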
edited Jan 26 at 17:48
answered Jan 23 at 9:08
paulo.filip3