Python functions to fetch spreadsheets through SFTP and from Dropbox [closed]

I have written two functions: the first reads files from SFTP and the second reads files from Dropbox.

Both functions share some similar lines of code, such as validating the extension and reading the file, that could be extracted into a separate function. I need your suggestions on how to do it.

Here are both functions:

class SftpHelper:
    def fetch_file_from_sftp(self, file_name, sheet_name=0):
        valid_extensions = ['csv', 'xls', 'xlsx']
        extension = file_name.split('.')[-1]
        sftp, transport = self.connect_to_sftp()
        remote_path = self.remote_dir + file_name
        data = io.BytesIO()
        sftp.getfo(remote_path, data, callback=None)
        if extension == 'csv':
            file_df = pd.read_csv(io.BytesIO(data.getvalue()))
        else:
            file_df = pd.read_excel(io.BytesIO(data.getvalue()), sheet_name=sheet_name)
        self.close_sftp_connection(sftp, transport)
        return file_df

class DropBoxHelper:
    def read_file_from_dropbox(self, file_name, sheet_name=0):
        valid_extensions = ['csv', 'xls', 'xlsx']
        extension = file_name.split('.')[-1]
        dbx = self.connect_to_dropbox()
        metadata, data = dbx.files_download(file_name)
        if extension == 'csv':
            file_df = pd.read_csv(io.BytesIO(data.content))
        else:
            file_df = pd.read_excel(io.BytesIO(data.content), sheet_name=sheet_name)
        return file_df

Can anyone please help me extract the common logic into a separate function and then use it in my two functions?

edited Jul 11 at 14:14 by 200_success

asked Jul 11 at 8:38 by AnalyticsPy

closed as off-topic by Daniel, Graipher, Toby Speight, Stephen Rauch, t3chb0t Jul 12 at 12:49

This question appears to be off-topic. The users who voted to close gave this specific reason:

"Code not implemented or not working as intended: Code Review is a community where programmers peer-review your working code to address issues such as security, maintainability, performance, and scalability. We require that the code be working correctly, to the best of the author's knowledge, before proceeding with a review." - Daniel, Graipher, Toby Speight, Stephen Rauch, t3chb0t

If this question can be reworded to fit the rules in the help center, please edit the question.

Please do not update the code in your question to incorporate feedback from answers; doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
- Mast
Jul 11 at 11:07

I will take care of this in future.
- AnalyticsPy
Jul 11 at 11:13

Please include enough context (the connect_to_sftp and connect_to_dropbox methods) so that we can give you proper advice.
- 200_success
Jul 11 at 14:16

2 Answers
accepted

Since you are already using classes here, you could derive from a base class that holds the shared behaviour and delegate the connection-specific behaviour to the derived classes:

import io
from os.path import splitext

import pandas as pd


class _RemoteHelper:
    def file_reader(self, file_name, sheet_name=0):
        # splitext() keeps the leading dot, e.g. '.csv'
        _, extension = splitext(file_name)
        data = self._internal_file_reader(file_name)

        if extension == '.csv':
            return pd.read_csv(io.BytesIO(data))
        else:
            return pd.read_excel(io.BytesIO(data), sheet_name=sheet_name)


class SftpHelper(_RemoteHelper):
    def _internal_file_reader(self, file_name):
        # Download the remote file into memory and return its raw bytes.
        data = io.BytesIO()
        sftp, transport = self.connect_to_sftp()
        sftp.getfo(self.remote_dir + file_name, data, callback=None)
        self.close_sftp_connection(sftp, transport)
        return data.getvalue()


class DropBoxHelper(_RemoteHelper):
    def _internal_file_reader(self, file_name):
        # files_download() returns (metadata, response); the bytes are in .content
        dbx = self.connect_to_dropbox()
        _, data = dbx.files_download(file_name)
        return data.content

This has the neat advantage of harmonizing the interfaces across both classes.
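To make the delegation idea concrete, here is a minimal, self-contained sketch of the same pattern that runs without any network access. It is an illustration under stated assumptions, not the code from the question: `RemoteHelper` and `InMemoryHelper` are hypothetical names, the in-memory subclass stands in for the SFTP and Dropbox helpers, and the standard library's `csv` module stands in for `pd.read_csv` so that no third-party package is needed.

```python
import csv
import io
from abc import ABC, abstractmethod
from os.path import splitext


class RemoteHelper(ABC):
    """Shared reading logic; subclasses only say how to fetch the bytes."""

    def file_reader(self, file_name):
        _, extension = splitext(file_name)  # keeps the dot, e.g. ".csv"
        data = self._internal_file_reader(file_name)
        if extension.lower() == ".csv":
            # Stand-in for pd.read_csv(io.BytesIO(data)).
            text = io.StringIO(data.decode("utf-8"))
            return list(csv.DictReader(text))
        raise ValueError(f"unsupported extension: {extension!r}")

    @abstractmethod
    def _internal_file_reader(self, file_name):
        """Return the raw bytes of file_name."""


class InMemoryHelper(RemoteHelper):
    """Hypothetical test double: serves files from a dict instead of a server."""

    def __init__(self, files):
        self._files = files

    def _internal_file_reader(self, file_name):
        return self._files[file_name]


helper = InMemoryHelper({"report.csv": b"a,b\n1,2\n"})
rows = helper.file_reader("report.csv")
```

Declaring the hook with `abc.abstractmethod` is a design choice, not something the answer requires: it makes Python refuse to instantiate a helper that forgot to implement `_internal_file_reader`, and the in-memory subclass gives a cheap way to test the shared logic.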

answered Jul 11 at 9:37 by Mathias Ettinger


From looking at the code, some things seem off:

valid_extensions is defined, but not used

connect_to_sftp(), self.remote_dir, io.BytesIO(), sftp.getfo(), pd, self.close_sftp_connection() and a bunch of other functions/fields are not defined

That being said, the core problem is addressed by creating a parent class which both your classes can inherit from. It'd look something like this:

import io

import pandas as pd


class FileHelper:
    def parse_fetched_file(self, file_name, data, sheet_name):
        # `data` is the raw file content as bytes.
        # valid_extensions is kept from the question; it is not used yet.
        valid_extensions = ['csv', 'xls', 'xlsx']
        extension = file_name.split('.')[-1]
        if extension == 'csv':
            return pd.read_csv(io.BytesIO(data))
        return pd.read_excel(io.BytesIO(data), sheet_name=sheet_name)


class SftpHelper(FileHelper):
    def fetch_file_from_sftp(self, file_name, sheet_name=0):
        sftp, transport = self.connect_to_sftp()
        remote_path = self.remote_dir + file_name
        data = io.BytesIO()
        sftp.getfo(remote_path, data, callback=None)
        # A BytesIO exposes its bytes through getvalue(), not .content
        file_df = super(SftpHelper, self).parse_fetched_file(file_name, data.getvalue(), sheet_name)
        self.close_sftp_connection(sftp, transport)
        return file_df


class DropBoxHelper(FileHelper):
    def read_file_from_dropbox(self, file_name, sheet_name=0):
        dbx = self.connect_to_dropbox()
        metadata, data = dbx.files_download(file_name)
        return super(DropBoxHelper, self).parse_fetched_file(file_name, data.content, sheet_name)

I'm not 100% sure that it's the most efficient syntax, but it gets the job done.

answered Jul 11 at 10:49 by maxb

Thanks for reviewing my code. I have edited my code snippet and added the valid_extensions line. If you look at the file-reading code, there is a difference between the SFTP and Dropbox read operations.
- AnalyticsPy
Jul 11 at 11:07

For example, in the SFTP case I am using data.getvalue(), but in the Dropbox case I am using data.content.
- AnalyticsPy
Jul 11 at 11:09

Ah, I missed that part, I'll try to update my answer in a bit. Still, the structure will be the same, but you'd send in data.content/data.getvalue() to parse_fetched_file instead.
- maxb
Jul 11 at 11:33

Don't you still have the unused valid_extensions variable? Also, why do you need to call super?
- Solomon Ucko
Jul 11 at 13:45

I was a bit unsure about the "correct" way to call functions in the super class, but Mathias provided a much clearer solution. I kept the valid_extensions because the code was obviously non-functional, so I assumed that it should later be used at that place.
- maxb
Jul 12 at 6:29
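Several comments point out that valid_extensions is defined but never checked. One way it could be put to use is sketched below; this is only an assumption about the intended behaviour, since the thread never shows the validation itself. The names `VALID_EXTENSIONS` and `get_valid_extension` are hypothetical, and raising `ValueError` on an unsupported file is a choice made for the example.

```python
from os.path import splitext

# Same values as the valid_extensions list in the question.
VALID_EXTENSIONS = ("csv", "xls", "xlsx")


def get_valid_extension(file_name, valid_extensions=VALID_EXTENSIONS):
    """Return the lower-cased extension without its dot, or raise ValueError."""
    _, dotted = splitext(file_name)   # "data.CSV" -> ".CSV"
    extension = dotted[1:].lower()    # drop the leading dot, ignore case
    if extension not in valid_extensions:
        raise ValueError(
            f"{file_name!r} has an unsupported extension; "
            f"expected one of {', '.join(valid_extensions)}"
        )
    return extension
```

A shared reader could call this helper first and then branch on the returned value, so a file such as notes.txt fails early instead of being passed to pd.read_excel.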
