Models for analysing Twitter data (*which based on tweepy package)

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;

up vote
1
down vote

favorite

I have three modules here models.py, functionals.py, and adjustment.py. These three may be used to perform analysis of Twitter data through tweepy objects.

The Tweets class is a simple one which takes 'tweet' objects and take it as a 'list', which is necessary (not yet implemented as necessary) as an argument of Authors class.

The Authors class is used to perform analysis of tweets by their authors/users. For example, we can plot the number of Followers of each user appearing in the data (the plot is by hbar_plot function).

Here is an example usage :

import numpy
from models import Authors
from models import Tweets

import matplotlib.pyplot as plt

stats = numpy.load('testfile.npy')
tweets = Tweets(stats)

Model = Authors(tweets)

fig, ax = plt.subplots(1, 1)
Model.hbar_plot(ax, measurement = 'Followers', incolor_measurement = 'Following', height = 0.5, color = (1,0,0,1))

fig, ax = plt.subplots(1, 1)
Model.hbar_plot(ax, measurement = 'Sample Tweets', incolor_measurement = 'Followers', height = 0.5, color = (1,0,0,1))
plt.show()

The example above plots the barplots, with bar colors representing the incolor_measurement variable.

The testfile.npy is available in https://github.com/anbarief/statistweepy/blob/master/examples/testfile.npy . This file consists of about 200 'tweet' objects of tweepy, collected a few days ago.

My question is, how to make better organization of the classes and their methods, also the functions here? How to write and organize it better?

The modules :

adjustment.py:

import copy

class Adjustment(object):

 def __init__(self, title = "an adjustment.", **kwargs):

 self.title = title

 if 'freq_lim' in kwargs:
 self.freq_lim = kwargs['freq_lim']
 else:
 self.freq_lim = None

 if 'char_lim' in kwargs:
 self.char_lim = kwargs['char_lim']
 else:
 self.char_lim = None

 if 'exclude' in kwargs:
 self.exclude = kwargs['exclude']
 else:
 self.exclude = None

functionals.py:

def filter_unique(tweet_stats_list, output = 'status'):

 stats = tweet_stats_list
 unique = 

 for tweet in stats:

 try:
 if not (tweet.retweeted_status in unique):
 if output == 'status':
 unique.append(tweet.retweeted_status)
 elif output == 'text':
 unique.append(tweet.retweeted_status.text)

 except:
 if not (tweet in unique):
 if output == 'status':
 unique.append(tweet)
 elif output == 'text':
 unique.append(tweet.text)

 return unique

def split_texts(texts, adjustment = None):
 split_list = 
 for text in texts:
 split = text.split()
 split_list.extend(split)
 return split_list

models.py:

import copy
import numpy
import matplotlib.pyplot as plt
from matplotlib import colors as mplcolors
from . import adjustment as adjust
from . import functionals as func


class Tweets(list):

 """

 Tweets model.

 """

 def __init__(self, *args, filter_by_unique = False, **kwargs):

 if filter_by_unique:
 tweets = func.filter_unique(args[0])
 else:
 tweets = args[0]

 list.__init__(self, tweets, **kwargs)

 @property
 def sorted_by_time(self):
 return sorted(self, key = lambda x: x.created_at)

 @property
 def oldest(self):
 return self.sorted_by_time[0]

 @property
 def newest(self):
 return self.sorted_by_time[-1]


class Authors(object):

 """

 Authors model.

 """

 def __init__(self, tweets):

 self.tweets = tweets
 self.authors = author.name : author for author in list(set([tweet.author for tweet in self.tweets]))
 self.username = author.screen_name : author for author in list(set([tweet.author for tweet in self.tweets]))
 self.followers_count = author: self.authors[author].followers_count for author in self.authors
 self.following_count = author: self.authors[author].friends_count for author in self.authors
 self.totaltweets = author: self.authors[author].statuses_count for author in self.authors
 self.tweets_by_author = author: [tweet for tweet in self.tweets if tweet.author.name == author] for author in self.authors
 self.tweets_count = author: len(self.tweets_by_author[author]) for author in self.tweets_by_author

 def hbar_plot(self, ax, measurement = 'Followers', color = (0,0,1,1), incolor_measurement = None, height = 1, textsize = 7, **kwargs):

 measurements = 'Followers': self.followers_count, 'Following' : self.following_count, 'Total Tweets' : self.totaltweets, 'Sample Tweets' : self.tweets_count
 sorted_authors = sorted(measurements[measurement], key = lambda x : measurements[measurement][x])

 if type(color) == str:
 color = mplcolors.hex2color(mplcolors.cnames[color]) 

 colors = len(self.authors)*[color]
 if incolor_measurement != None:
 minor_max = max(measurements[incolor_measurement].values())
 transparency = [measurements[incolor_measurement][author]/minor_max for author in sorted_authors]
 colors = [(color[0], color[1], color[2], trans) for trans in transparency]

 var = [i+height for i in range(len(self.authors))]
 ax.barh([i-height/2 for i in var], [measurements[measurement][author] for author in sorted_authors], height = height, color = colors, **kwargs)
 ax.set_yticks(var)
 ax.set_yticklabels(sorted_authors, rotation = 'horizontal', size = textsize)

 if incolor_measurement != None:
 ax.set_xlabel(measurement + ' (color : '+incolor_measurement+')')
 else :
 ax.set_xlabel(measurement)
 plt.tight_layout()

Result of example usage :

enter image description here

edited Jul 30 at 17:51

asked Jul 17 at 14:06

Arief

405112

add a commentÂ |Â

up vote
1
down vote

favorite

I have three modules here models.py, functionals.py, and adjustment.py. These three may be used to perform analysis of Twitter data through tweepy objects.

The Tweets class is a simple one which takes 'tweet' objects and take it as a 'list', which is necessary (not yet implemented as necessary) as an argument of Authors class.

Here is an example usage :

import numpy
from models import Authors
from models import Tweets

import matplotlib.pyplot as plt

stats = numpy.load('testfile.npy')
tweets = Tweets(stats)

Model = Authors(tweets)

fig, ax = plt.subplots(1, 1)
Model.hbar_plot(ax, measurement = 'Followers', incolor_measurement = 'Following', height = 0.5, color = (1,0,0,1))

fig, ax = plt.subplots(1, 1)
Model.hbar_plot(ax, measurement = 'Sample Tweets', incolor_measurement = 'Followers', height = 0.5, color = (1,0,0,1))
plt.show()

The example above plots the barplots, with bar colors representing the incolor_measurement variable.

My question is, how to make better organization of the classes and their methods, also the functions here? How to write and organize it better?

The modules :

adjustment.py:

import copy

class Adjustment(object):

 def __init__(self, title = "an adjustment.", **kwargs):

 self.title = title

 if 'freq_lim' in kwargs:
 self.freq_lim = kwargs['freq_lim']
 else:
 self.freq_lim = None

 if 'char_lim' in kwargs:
 self.char_lim = kwargs['char_lim']
 else:
 self.char_lim = None

 if 'exclude' in kwargs:
 self.exclude = kwargs['exclude']
 else:
 self.exclude = None

functionals.py:

def filter_unique(tweet_stats_list, output = 'status'):

 stats = tweet_stats_list
 unique = 

 for tweet in stats:

 try:
 if not (tweet.retweeted_status in unique):
 if output == 'status':
 unique.append(tweet.retweeted_status)
 elif output == 'text':
 unique.append(tweet.retweeted_status.text)

 except:
 if not (tweet in unique):
 if output == 'status':
 unique.append(tweet)
 elif output == 'text':
 unique.append(tweet.text)

 return unique

def split_texts(texts, adjustment = None):
 split_list = 
 for text in texts:
 split = text.split()
 split_list.extend(split)
 return split_list

models.py:

import copy
import numpy
import matplotlib.pyplot as plt
from matplotlib import colors as mplcolors
from . import adjustment as adjust
from . import functionals as func


class Tweets(list):

 """

 Tweets model.

 """

 def __init__(self, *args, filter_by_unique = False, **kwargs):

 if filter_by_unique:
 tweets = func.filter_unique(args[0])
 else:
 tweets = args[0]

 list.__init__(self, tweets, **kwargs)

 @property
 def sorted_by_time(self):
 return sorted(self, key = lambda x: x.created_at)

 @property
 def oldest(self):
 return self.sorted_by_time[0]

 @property
 def newest(self):
 return self.sorted_by_time[-1]


class Authors(object):

 """

 Authors model.

 """

 def __init__(self, tweets):

 self.tweets = tweets
 self.authors = author.name : author for author in list(set([tweet.author for tweet in self.tweets]))
 self.username = author.screen_name : author for author in list(set([tweet.author for tweet in self.tweets]))
 self.followers_count = author: self.authors[author].followers_count for author in self.authors
 self.following_count = author: self.authors[author].friends_count for author in self.authors
 self.totaltweets = author: self.authors[author].statuses_count for author in self.authors
 self.tweets_by_author = author: [tweet for tweet in self.tweets if tweet.author.name == author] for author in self.authors
 self.tweets_count = author: len(self.tweets_by_author[author]) for author in self.tweets_by_author

 def hbar_plot(self, ax, measurement = 'Followers', color = (0,0,1,1), incolor_measurement = None, height = 1, textsize = 7, **kwargs):

 measurements = 'Followers': self.followers_count, 'Following' : self.following_count, 'Total Tweets' : self.totaltweets, 'Sample Tweets' : self.tweets_count
 sorted_authors = sorted(measurements[measurement], key = lambda x : measurements[measurement][x])

 if type(color) == str:
 color = mplcolors.hex2color(mplcolors.cnames[color]) 

 colors = len(self.authors)*[color]
 if incolor_measurement != None:
 minor_max = max(measurements[incolor_measurement].values())
 transparency = [measurements[incolor_measurement][author]/minor_max for author in sorted_authors]
 colors = [(color[0], color[1], color[2], trans) for trans in transparency]

 var = [i+height for i in range(len(self.authors))]
 ax.barh([i-height/2 for i in var], [measurements[measurement][author] for author in sorted_authors], height = height, color = colors, **kwargs)
 ax.set_yticks(var)
 ax.set_yticklabels(sorted_authors, rotation = 'horizontal', size = textsize)

 if incolor_measurement != None:
 ax.set_xlabel(measurement + ' (color : '+incolor_measurement+')')
 else :
 ax.set_xlabel(measurement)
 plt.tight_layout()

Result of example usage :

enter image description here

edited Jul 30 at 17:51

asked Jul 17 at 14:06

Arief

405112

add a commentÂ |Â

up vote
1
down vote

favorite

I have three modules here models.py, functionals.py, and adjustment.py. These three may be used to perform analysis of Twitter data through tweepy objects.

The Tweets class is a simple one which takes 'tweet' objects and take it as a 'list', which is necessary (not yet implemented as necessary) as an argument of Authors class.

Here is an example usage :

import numpy
from models import Authors
from models import Tweets

import matplotlib.pyplot as plt

stats = numpy.load('testfile.npy')
tweets = Tweets(stats)

Model = Authors(tweets)

fig, ax = plt.subplots(1, 1)
Model.hbar_plot(ax, measurement = 'Followers', incolor_measurement = 'Following', height = 0.5, color = (1,0,0,1))

fig, ax = plt.subplots(1, 1)
Model.hbar_plot(ax, measurement = 'Sample Tweets', incolor_measurement = 'Followers', height = 0.5, color = (1,0,0,1))
plt.show()

The example above plots the barplots, with bar colors representing the incolor_measurement variable.

My question is, how to make better organization of the classes and their methods, also the functions here? How to write and organize it better?

The modules :

adjustment.py:

import copy

class Adjustment(object):

 def __init__(self, title = "an adjustment.", **kwargs):

 self.title = title

 if 'freq_lim' in kwargs:
 self.freq_lim = kwargs['freq_lim']
 else:
 self.freq_lim = None

 if 'char_lim' in kwargs:
 self.char_lim = kwargs['char_lim']
 else:
 self.char_lim = None

 if 'exclude' in kwargs:
 self.exclude = kwargs['exclude']
 else:
 self.exclude = None

functionals.py:

def filter_unique(tweet_stats_list, output = 'status'):

 stats = tweet_stats_list
 unique = 

 for tweet in stats:

 try:
 if not (tweet.retweeted_status in unique):
 if output == 'status':
 unique.append(tweet.retweeted_status)
 elif output == 'text':
 unique.append(tweet.retweeted_status.text)

 except:
 if not (tweet in unique):
 if output == 'status':
 unique.append(tweet)
 elif output == 'text':
 unique.append(tweet.text)

 return unique

def split_texts(texts, adjustment = None):
 split_list = 
 for text in texts:
 split = text.split()
 split_list.extend(split)
 return split_list

models.py:

import copy
import numpy
import matplotlib.pyplot as plt
from matplotlib import colors as mplcolors
from . import adjustment as adjust
from . import functionals as func


class Tweets(list):

 """

 Tweets model.

 """

 def __init__(self, *args, filter_by_unique = False, **kwargs):

 if filter_by_unique:
 tweets = func.filter_unique(args[0])
 else:
 tweets = args[0]

 list.__init__(self, tweets, **kwargs)

 @property
 def sorted_by_time(self):
 return sorted(self, key = lambda x: x.created_at)

 @property
 def oldest(self):
 return self.sorted_by_time[0]

 @property
 def newest(self):
 return self.sorted_by_time[-1]


class Authors(object):

 """

 Authors model.

 """

 def __init__(self, tweets):

 self.tweets = tweets
 self.authors = author.name : author for author in list(set([tweet.author for tweet in self.tweets]))
 self.username = author.screen_name : author for author in list(set([tweet.author for tweet in self.tweets]))
 self.followers_count = author: self.authors[author].followers_count for author in self.authors
 self.following_count = author: self.authors[author].friends_count for author in self.authors
 self.totaltweets = author: self.authors[author].statuses_count for author in self.authors
 self.tweets_by_author = author: [tweet for tweet in self.tweets if tweet.author.name == author] for author in self.authors
 self.tweets_count = author: len(self.tweets_by_author[author]) for author in self.tweets_by_author

 def hbar_plot(self, ax, measurement = 'Followers', color = (0,0,1,1), incolor_measurement = None, height = 1, textsize = 7, **kwargs):

 measurements = 'Followers': self.followers_count, 'Following' : self.following_count, 'Total Tweets' : self.totaltweets, 'Sample Tweets' : self.tweets_count
 sorted_authors = sorted(measurements[measurement], key = lambda x : measurements[measurement][x])

 if type(color) == str:
 color = mplcolors.hex2color(mplcolors.cnames[color]) 

 colors = len(self.authors)*[color]
 if incolor_measurement != None:
 minor_max = max(measurements[incolor_measurement].values())
 transparency = [measurements[incolor_measurement][author]/minor_max for author in sorted_authors]
 colors = [(color[0], color[1], color[2], trans) for trans in transparency]

 var = [i+height for i in range(len(self.authors))]
 ax.barh([i-height/2 for i in var], [measurements[measurement][author] for author in sorted_authors], height = height, color = colors, **kwargs)
 ax.set_yticks(var)
 ax.set_yticklabels(sorted_authors, rotation = 'horizontal', size = textsize)

 if incolor_measurement != None:
 ax.set_xlabel(measurement + ' (color : '+incolor_measurement+')')
 else :
 ax.set_xlabel(measurement)
 plt.tight_layout()

Result of example usage :

enter image description here

edited Jul 30 at 17:51

asked Jul 17 at 14:06

Arief

405112

I have three modules here models.py, functionals.py, and adjustment.py. These three may be used to perform analysis of Twitter data through tweepy objects.

The Tweets class is a simple one which takes 'tweet' objects and take it as a 'list', which is necessary (not yet implemented as necessary) as an argument of Authors class.

Here is an example usage :

import numpy
from models import Authors
from models import Tweets

import matplotlib.pyplot as plt

stats = numpy.load('testfile.npy')
tweets = Tweets(stats)

Model = Authors(tweets)

fig, ax = plt.subplots(1, 1)
Model.hbar_plot(ax, measurement = 'Followers', incolor_measurement = 'Following', height = 0.5, color = (1,0,0,1))

fig, ax = plt.subplots(1, 1)
Model.hbar_plot(ax, measurement = 'Sample Tweets', incolor_measurement = 'Followers', height = 0.5, color = (1,0,0,1))
plt.show()

The example above plots the barplots, with bar colors representing the incolor_measurement variable.

My question is, how to make better organization of the classes and their methods, also the functions here? How to write and organize it better?

The modules :

adjustment.py:

import copy

class Adjustment(object):

 def __init__(self, title = "an adjustment.", **kwargs):

 self.title = title

 if 'freq_lim' in kwargs:
 self.freq_lim = kwargs['freq_lim']
 else:
 self.freq_lim = None

 if 'char_lim' in kwargs:
 self.char_lim = kwargs['char_lim']
 else:
 self.char_lim = None

 if 'exclude' in kwargs:
 self.exclude = kwargs['exclude']
 else:
 self.exclude = None

functionals.py:

def filter_unique(tweet_stats_list, output = 'status'):

 stats = tweet_stats_list
 unique = 

 for tweet in stats:

 try:
 if not (tweet.retweeted_status in unique):
 if output == 'status':
 unique.append(tweet.retweeted_status)
 elif output == 'text':
 unique.append(tweet.retweeted_status.text)

 except:
 if not (tweet in unique):
 if output == 'status':
 unique.append(tweet)
 elif output == 'text':
 unique.append(tweet.text)

 return unique

def split_texts(texts, adjustment = None):
 split_list = 
 for text in texts:
 split = text.split()
 split_list.extend(split)
 return split_list

models.py:

import copy
import numpy
import matplotlib.pyplot as plt
from matplotlib import colors as mplcolors
from . import adjustment as adjust
from . import functionals as func


class Tweets(list):

 """

 Tweets model.

 """

 def __init__(self, *args, filter_by_unique = False, **kwargs):

 if filter_by_unique:
 tweets = func.filter_unique(args[0])
 else:
 tweets = args[0]

 list.__init__(self, tweets, **kwargs)

 @property
 def sorted_by_time(self):
 return sorted(self, key = lambda x: x.created_at)

 @property
 def oldest(self):
 return self.sorted_by_time[0]

 @property
 def newest(self):
 return self.sorted_by_time[-1]


class Authors(object):

 """

 Authors model.

 """

 def __init__(self, tweets):

 self.tweets = tweets
 self.authors = author.name : author for author in list(set([tweet.author for tweet in self.tweets]))
 self.username = author.screen_name : author for author in list(set([tweet.author for tweet in self.tweets]))
 self.followers_count = author: self.authors[author].followers_count for author in self.authors
 self.following_count = author: self.authors[author].friends_count for author in self.authors
 self.totaltweets = author: self.authors[author].statuses_count for author in self.authors
 self.tweets_by_author = author: [tweet for tweet in self.tweets if tweet.author.name == author] for author in self.authors
 self.tweets_count = author: len(self.tweets_by_author[author]) for author in self.tweets_by_author

 def hbar_plot(self, ax, measurement = 'Followers', color = (0,0,1,1), incolor_measurement = None, height = 1, textsize = 7, **kwargs):

 measurements = 'Followers': self.followers_count, 'Following' : self.following_count, 'Total Tweets' : self.totaltweets, 'Sample Tweets' : self.tweets_count
 sorted_authors = sorted(measurements[measurement], key = lambda x : measurements[measurement][x])

 if type(color) == str:
 color = mplcolors.hex2color(mplcolors.cnames[color]) 

 colors = len(self.authors)*[color]
 if incolor_measurement != None:
 minor_max = max(measurements[incolor_measurement].values())
 transparency = [measurements[incolor_measurement][author]/minor_max for author in sorted_authors]
 colors = [(color[0], color[1], color[2], trans) for trans in transparency]

 var = [i+height for i in range(len(self.authors))]
 ax.barh([i-height/2 for i in var], [measurements[measurement][author] for author in sorted_authors], height = height, color = colors, **kwargs)
 ax.set_yticks(var)
 ax.set_yticklabels(sorted_authors, rotation = 'horizontal', size = textsize)

 if incolor_measurement != None:
 ax.set_xlabel(measurement + ' (color : '+incolor_measurement+')')
 else :
 ax.set_xlabel(measurement)
 plt.tight_layout()

Result of example usage :

enter image description here

edited Jul 30 at 17:51

asked Jul 17 at 14:06

Arief

405112

edited Jul 30 at 17:51

asked Jul 17 at 14:06

Arief

405112

asked Jul 17 at 14:06

Arief

405112

asked Jul 17 at 14:06

Arief

405112

add a commentÂ |Â

1 Answer
1

active

oldest

votes

up vote
2
down vote

accepted

YouÃ¢Â€Â™ve presented 5 units (classes, functions) in 3 different files. I understand that you have a few more in your repository but I think this is over-splitting. You should at least merge adjustment.py into models.py and radial_bar.py into functional.py (and, btw, this kind of file is usually named utils.py).

Out of the 5 units presented here, 2 are unused: split_texts and Adjustment. So a quick word about them:

You can drop the adjustment parameter of split_texts as it is unused;

You can use itertools.chain.from_iterable to flatten nested iterables;

You can simplify optional arguments by dropping **kwargs in favor of default values.

All in all, these two can be condensed to:

import itertools


class Adjustment:
 def __init__(self, title="An adjustment.", freq_lim=None, char_lim=None, exclude=None):
 self.title = title
 self.freq_lim = freq_lim
 self.char_lim = char_lim
 self.exclude = exclude


def split_texts(texts):
 return list(itertools.chain.from_iterable(map(str.split, texts)))

But the Adjustment class would be even more succint and easily extensible if it were one of the new dataclasses:

from dataclasses import dataclass


@dataclass(frozen=True)
class Adjustment:
 title: str = "An adjustment."
 freq_lim: int = None
 char_lim: int = None
 exclude: str = None

filter_unique doesn't feel like it produce any usable output. For starter, you are comparing apples and oranges when output is 'text' because you store the .text attribute of your objects in unique but you are only checking if the object itself is in unique, not its text. Second, you return unique which may contain tweets, statuses or text; but your usage in the Tweet class suggest that you want to return tweets every time.

To improve things, we could:

check unicity using a set whose contain check is done in $mathcalO(1)$ instead of $mathcalO(n)$ for lists;

extract the desired attribute out of the object to test for unicity but return the actual tweet;

extract attributes step by step instead of having 4Ã‚Â explicit cases;

turn the function into a generator, since it will be fed into the list constructor anyway.

Proposed improvements:

from contextlib import suppress


def filter_unique(tweet_stats_list, output='status'):
 uniques = set()
 for tweet in tweet_stats_list:
 tweet_attr = tweet
 with suppress(AttributeError):
 tweet_attr = tweet_attr.retweeted_status
 if output == 'text':
 tweet_attr = tweet_attr.text
 if tweet_attr not in uniques:
 uniques.add(tweet_attr)
 yield tweet

The Tweet class feels somewhat fine but I don't understand how it improves value over a simple list based on the provided model (I didn't read the other models on your repository, so it may be more obvious there). Other than that the docstring doesn't add any value: this is the class Tweet in the module models, sure this is some kind of """Tweet model"""Ã¢Â€Â¦

I also don't understand why you allow for variable number of arguments (using *args) but only ever use args[0]: youÃ¢Â€Â™re not even guaranteed that there is at least 1 element in args. Better use an explicit argument here.

Lastly, you should use super() instead of explicitly calling the parent class. It doesn't matter much in such case but it's a good habit to get into if you ever use multiple inheritance (or just in case you decide to add a layer of abstraction between Tweet and list).

Proposed improvements:

import operator


class Tweets(list):
 def __init__(self, tweets, filter_by_unique=False, **kwargs):
 if filter_by_unique:
 tweets = filter_unique(tweets)

 super().__init__(tweets, **kwargs)

 @property
 def sorted_by_time(self):
 return sorted(self, key=operator.attrgetter('created_at'))

 @property
 def oldest(self):
 return min(self, key=operator.attrgetter('created_at'))

 @property
 def newest(self):
 return max(self, key=operator.attrgetter('created_at'))

The Author class feels very messy. You extract a handful of information out of your tweets stored as attributes, but half of them end up unused.

Since you're only interested in counting some stats about your author, you should store only those. And to help you along the way (especially counting the number of tweets of a particular author), you can:

Sort the array of tweets by author names before processing it;

Group the array by author names to gather together the tweets of a single author.

Note that I don't know tweepy enough to know if there is a more "unique" identification method than the name of the author. (I do imagine that several author can share a single name, so there must be some.)

Proposed improvements:

import operator


class Authors(object):
 def __init__(self, tweets):
 unicity_key = operator.attrgetter('author.name')
 tweets = sorted(tweets, key=unicity_key)

 self.followers_count = 
 self.following_count = 
 self.total_tweets = 
 self.tweets_count = 
 for _, author_tweets in itertools.groupby(tweets, key=unicity_key):
 author_tweets = list(author_tweets)
 author = author_tweets[0].author
 self.followers_count[author.name] = author.followers_count
 self.following_count[author.name] = author.friends_count
 self.total_tweets[author.name] = author.statuses_count
 self.tweets_count[author.name] = len(author_tweets)

 def hbar_plot(self, ax, measurement='Followers', color=(0,0,1,1), incolor_measurement=None, height=1, textsize=7, **kwargs):
 measurements = 
 'Followers': self.followers_count,
 'Following': self.following_count,
 'Total Tweets': self.total_tweets,
 'Sample Tweets': self.tweets_count,
 
 author_measurement = measurements[measurement]
 sorted_authors = sorted(author_measurement, key=author_measurement.__getitem__)

 if isinstance(color, str):
 color = mplcolors.hex2color(mplcolors.cnames[color])

 if incolor_measurement is not None:
 color_measurement = measurements[incolor_measurement]
 minor_max = max(color_measurement.values())
 colors = [(*color[:3], color_measurement[author] / minor_max) for author in sorted_authors]
 measurement = ' (color: )'.format(measurement, incolor_measurement)
 else:
 colors = [color] * len(author_measurement)

 ticks, values = zip(*((i + height, author_measurement[author]) for i, author in enumerate(sorted_authors)))
 ax.barh([i - height / 2 for i in ticks], values, height=height, color=colors, **kwargs)
 ax.set_yticks(ticks)
 ax.set_yticklabels(sorted_authors, rotation='horizontal', size=textsize)
 ax.set_xlabel(measurement)
 plt.tight_layout()

edited Jul 18 at 10:34

answered Jul 18 at 0:40

Mathias Ettinger

21.7k32875

Thanks for the review. The Adjustment class may be updated with more inputs for more possible adjustments.
â€“Â Arief
Jul 18 at 10:06

@Arief Possibly. But you don't extract anything else from kwargs and you don't store it either; so there is no need in having it around.
â€“Â Mathias Ettinger
Jul 18 at 10:22

Whyfilter_unique is not useful..? the argument output=... is for preference whether the output is in form of Status with full features, or only the text. Each Status object has unique ID, two with same ID could be mixed in without notice.. the filtering is on ID by default.
â€“Â Arief
Jul 26 at 10:44

1

@Arief In your original version, if output is text, you check for status but store text in unique; thus no filtering taking place. Now I don't know the inner details of your objects but if statuses have a unique ID, you should store and check for that ID instead.
â€“Â Mathias Ettinger
Jul 26 at 11:45

1

@Arief Because 1) we reuse the expression twice, so better give it a name; 2) namming a lambda defeat the purpose and is better done by defining a function; 3) instead of defining a function here and reinventing the wheel, it's better to use what's already in the language; 4) lambdas are slow anyway and the operator module is implemented in C.
â€“Â Mathias Ettinger
Jul 31 at 6:41

Â |Â
show 3 more comments

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f199686%2fmodels-for-analysing-twitter-data-which-based-on-tweepy-package%23new-answer', 'question_page');

);

Post as a guest

Name

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
2
down vote

accepted

Out of the 5 units presented here, 2 are unused: split_texts and Adjustment. So a quick word about them:

You can drop the adjustment parameter of split_texts as it is unused;

You can use itertools.chain.from_iterable to flatten nested iterables;

You can simplify optional arguments by dropping **kwargs in favor of default values.

All in all, these two can be condensed to:

import itertools


class Adjustment:
 def __init__(self, title="An adjustment.", freq_lim=None, char_lim=None, exclude=None):
 self.title = title
 self.freq_lim = freq_lim
 self.char_lim = char_lim
 self.exclude = exclude


def split_texts(texts):
 return list(itertools.chain.from_iterable(map(str.split, texts)))

But the Adjustment class would be even more succint and easily extensible if it were one of the new dataclasses:

from dataclasses import dataclass


@dataclass(frozen=True)
class Adjustment:
 title: str = "An adjustment."
 freq_lim: int = None
 char_lim: int = None
 exclude: str = None

To improve things, we could:

check unicity using a set whose contain check is done in $mathcalO(1)$ instead of $mathcalO(n)$ for lists;

extract the desired attribute out of the object to test for unicity but return the actual tweet;

extract attributes step by step instead of having 4Ã‚Â explicit cases;

turn the function into a generator, since it will be fed into the list constructor anyway.

Proposed improvements:

from contextlib import suppress


def filter_unique(tweet_stats_list, output='status'):
 uniques = set()
 for tweet in tweet_stats_list:
 tweet_attr = tweet
 with suppress(AttributeError):
 tweet_attr = tweet_attr.retweeted_status
 if output == 'text':
 tweet_attr = tweet_attr.text
 if tweet_attr not in uniques:
 uniques.add(tweet_attr)
 yield tweet

Proposed improvements:

import operator


class Tweets(list):
 def __init__(self, tweets, filter_by_unique=False, **kwargs):
 if filter_by_unique:
 tweets = filter_unique(tweets)

 super().__init__(tweets, **kwargs)

 @property
 def sorted_by_time(self):
 return sorted(self, key=operator.attrgetter('created_at'))

 @property
 def oldest(self):
 return min(self, key=operator.attrgetter('created_at'))

 @property
 def newest(self):
 return max(self, key=operator.attrgetter('created_at'))

The Author class feels very messy. You extract a handful of information out of your tweets stored as attributes, but half of them end up unused.

Sort the array of tweets by author names before processing it;

Group the array by author names to gather together the tweets of a single author.

Proposed improvements:

import operator


class Authors(object):
 def __init__(self, tweets):
 unicity_key = operator.attrgetter('author.name')
 tweets = sorted(tweets, key=unicity_key)

 self.followers_count = 
 self.following_count = 
 self.total_tweets = 
 self.tweets_count = 
 for _, author_tweets in itertools.groupby(tweets, key=unicity_key):
 author_tweets = list(author_tweets)
 author = author_tweets[0].author
 self.followers_count[author.name] = author.followers_count
 self.following_count[author.name] = author.friends_count
 self.total_tweets[author.name] = author.statuses_count
 self.tweets_count[author.name] = len(author_tweets)

 def hbar_plot(self, ax, measurement='Followers', color=(0,0,1,1), incolor_measurement=None, height=1, textsize=7, **kwargs):
 measurements = 
 'Followers': self.followers_count,
 'Following': self.following_count,
 'Total Tweets': self.total_tweets,
 'Sample Tweets': self.tweets_count,
 
 author_measurement = measurements[measurement]
 sorted_authors = sorted(author_measurement, key=author_measurement.__getitem__)

 if isinstance(color, str):
 color = mplcolors.hex2color(mplcolors.cnames[color])

 if incolor_measurement is not None:
 color_measurement = measurements[incolor_measurement]
 minor_max = max(color_measurement.values())
 colors = [(*color[:3], color_measurement[author] / minor_max) for author in sorted_authors]
 measurement = ' (color: )'.format(measurement, incolor_measurement)
 else:
 colors = [color] * len(author_measurement)

 ticks, values = zip(*((i + height, author_measurement[author]) for i, author in enumerate(sorted_authors)))
 ax.barh([i - height / 2 for i in ticks], values, height=height, color=colors, **kwargs)
 ax.set_yticks(ticks)
 ax.set_yticklabels(sorted_authors, rotation='horizontal', size=textsize)
 ax.set_xlabel(measurement)
 plt.tight_layout()

edited Jul 18 at 10:34

answered Jul 18 at 0:40

Mathias Ettinger

21.7k32875

Thanks for the review. The Adjustment class may be updated with more inputs for more possible adjustments.
â€“Â Arief
Jul 18 at 10:06

@Arief Possibly. But you don't extract anything else from kwargs and you don't store it either; so there is no need in having it around.
â€“Â Mathias Ettinger
Jul 18 at 10:22

Whyfilter_unique is not useful..? the argument output=... is for preference whether the output is in form of Status with full features, or only the text. Each Status object has unique ID, two with same ID could be mixed in without notice.. the filtering is on ID by default.
â€“Â Arief
Jul 26 at 10:44

1

@Arief In your original version, if output is text, you check for status but store text in unique; thus no filtering taking place. Now I don't know the inner details of your objects but if statuses have a unique ID, you should store and check for that ID instead.
â€“Â Mathias Ettinger
Jul 26 at 11:45

1

@Arief Because 1) we reuse the expression twice, so better give it a name; 2) namming a lambda defeat the purpose and is better done by defining a function; 3) instead of defining a function here and reinventing the wheel, it's better to use what's already in the language; 4) lambdas are slow anyway and the operator module is implemented in C.
â€“Â Mathias Ettinger
Jul 31 at 6:41

Â |Â
show 3 more comments

up vote
2
down vote

accepted

Out of the 5 units presented here, 2 are unused: split_texts and Adjustment. So a quick word about them:

You can drop the adjustment parameter of split_texts as it is unused;

You can use itertools.chain.from_iterable to flatten nested iterables;

You can simplify optional arguments by dropping **kwargs in favor of default values.

All in all, these two can be condensed to:

import itertools


class Adjustment:
 def __init__(self, title="An adjustment.", freq_lim=None, char_lim=None, exclude=None):
 self.title = title
 self.freq_lim = freq_lim
 self.char_lim = char_lim
 self.exclude = exclude


def split_texts(texts):
 return list(itertools.chain.from_iterable(map(str.split, texts)))

But the Adjustment class would be even more succint and easily extensible if it were one of the new dataclasses:

from dataclasses import dataclass


@dataclass(frozen=True)
class Adjustment:
 title: str = "An adjustment."
 freq_lim: int = None
 char_lim: int = None
 exclude: str = None

To improve things, we could:

check unicity using a set whose contain check is done in $mathcalO(1)$ instead of $mathcalO(n)$ for lists;

extract the desired attribute out of the object to test for unicity but return the actual tweet;

extract attributes step by step instead of having 4Ã‚Â explicit cases;

turn the function into a generator, since it will be fed into the list constructor anyway.

Proposed improvements:

from contextlib import suppress


def filter_unique(tweet_stats_list, output='status'):
 uniques = set()
 for tweet in tweet_stats_list:
 tweet_attr = tweet
 with suppress(AttributeError):
 tweet_attr = tweet_attr.retweeted_status
 if output == 'text':
 tweet_attr = tweet_attr.text
 if tweet_attr not in uniques:
 uniques.add(tweet_attr)
 yield tweet

Proposed improvements:

import operator


class Tweets(list):
 def __init__(self, tweets, filter_by_unique=False, **kwargs):
 if filter_by_unique:
 tweets = filter_unique(tweets)

 super().__init__(tweets, **kwargs)

 @property
 def sorted_by_time(self):
 return sorted(self, key=operator.attrgetter('created_at'))

 @property
 def oldest(self):
 return min(self, key=operator.attrgetter('created_at'))

 @property
 def newest(self):
 return max(self, key=operator.attrgetter('created_at'))

The Author class feels very messy. You extract a handful of information out of your tweets stored as attributes, but half of them end up unused.

Sort the array of tweets by author names before processing it;

Group the array by author names to gather together the tweets of a single author.

Proposed improvements:

import operator


class Authors(object):
 def __init__(self, tweets):
 unicity_key = operator.attrgetter('author.name')
 tweets = sorted(tweets, key=unicity_key)

 self.followers_count = 
 self.following_count = 
 self.total_tweets = 
 self.tweets_count = 
 for _, author_tweets in itertools.groupby(tweets, key=unicity_key):
 author_tweets = list(author_tweets)
 author = author_tweets[0].author
 self.followers_count[author.name] = author.followers_count
 self.following_count[author.name] = author.friends_count
 self.total_tweets[author.name] = author.statuses_count
 self.tweets_count[author.name] = len(author_tweets)

 def hbar_plot(self, ax, measurement='Followers', color=(0,0,1,1), incolor_measurement=None, height=1, textsize=7, **kwargs):
 measurements = 
 'Followers': self.followers_count,
 'Following': self.following_count,
 'Total Tweets': self.total_tweets,
 'Sample Tweets': self.tweets_count,
 
 author_measurement = measurements[measurement]
 sorted_authors = sorted(author_measurement, key=author_measurement.__getitem__)

 if isinstance(color, str):
 color = mplcolors.hex2color(mplcolors.cnames[color])

 if incolor_measurement is not None:
 color_measurement = measurements[incolor_measurement]
 minor_max = max(color_measurement.values())
 colors = [(*color[:3], color_measurement[author] / minor_max) for author in sorted_authors]
 measurement = ' (color: )'.format(measurement, incolor_measurement)
 else:
 colors = [color] * len(author_measurement)

 ticks, values = zip(*((i + height, author_measurement[author]) for i, author in enumerate(sorted_authors)))
 ax.barh([i - height / 2 for i in ticks], values, height=height, color=colors, **kwargs)
 ax.set_yticks(ticks)
 ax.set_yticklabels(sorted_authors, rotation='horizontal', size=textsize)
 ax.set_xlabel(measurement)
 plt.tight_layout()

edited Jul 18 at 10:34

answered Jul 18 at 0:40

Mathias Ettinger

21.7k32875

Thanks for the review. The Adjustment class may be updated with more inputs for more possible adjustments.
â€“Â Arief
Jul 18 at 10:06

@Arief Possibly. But you don't extract anything else from kwargs and you don't store it either; so there is no need in having it around.
â€“Â Mathias Ettinger
Jul 18 at 10:22

Whyfilter_unique is not useful..? the argument output=... is for preference whether the output is in form of Status with full features, or only the text. Each Status object has unique ID, two with same ID could be mixed in without notice.. the filtering is on ID by default.
â€“Â Arief
Jul 26 at 10:44

1

@Arief In your original version, if output is text, you check for status but store text in unique; thus no filtering taking place. Now I don't know the inner details of your objects but if statuses have a unique ID, you should store and check for that ID instead.
â€“Â Mathias Ettinger
Jul 26 at 11:45

1

@Arief Because 1) we reuse the expression twice, so better give it a name; 2) namming a lambda defeat the purpose and is better done by defining a function; 3) instead of defining a function here and reinventing the wheel, it's better to use what's already in the language; 4) lambdas are slow anyway and the operator module is implemented in C.
â€“Â Mathias Ettinger
Jul 31 at 6:41

Â |Â
show 3 more comments

up vote
2
down vote

accepted

Out of the 5 units presented here, 2 are unused: split_texts and Adjustment. So a quick word about them:

You can drop the adjustment parameter of split_texts as it is unused;

You can use itertools.chain.from_iterable to flatten nested iterables;

You can simplify optional arguments by dropping **kwargs in favor of default values.

All in all, these two can be condensed to:

import itertools


class Adjustment:
 def __init__(self, title="An adjustment.", freq_lim=None, char_lim=None, exclude=None):
 self.title = title
 self.freq_lim = freq_lim
 self.char_lim = char_lim
 self.exclude = exclude


def split_texts(texts):
 return list(itertools.chain.from_iterable(map(str.split, texts)))

But the Adjustment class would be even more succint and easily extensible if it were one of the new dataclasses:

from dataclasses import dataclass


@dataclass(frozen=True)
class Adjustment:
 title: str = "An adjustment."
 freq_lim: int = None
 char_lim: int = None
 exclude: str = None

To improve things, we could:

check unicity using a set whose contain check is done in $mathcalO(1)$ instead of $mathcalO(n)$ for lists;

extract the desired attribute out of the object to test for unicity but return the actual tweet;

extract attributes step by step instead of having 4Ã‚Â explicit cases;

turn the function into a generator, since it will be fed into the list constructor anyway.

Proposed improvements:

from contextlib import suppress


def filter_unique(tweet_stats_list, output='status'):
 uniques = set()
 for tweet in tweet_stats_list:
 tweet_attr = tweet
 with suppress(AttributeError):
 tweet_attr = tweet_attr.retweeted_status
 if output == 'text':
 tweet_attr = tweet_attr.text
 if tweet_attr not in uniques:
 uniques.add(tweet_attr)
 yield tweet

Proposed improvements:

import operator


class Tweets(list):
 def __init__(self, tweets, filter_by_unique=False, **kwargs):
 if filter_by_unique:
 tweets = filter_unique(tweets)

 super().__init__(tweets, **kwargs)

 @property
 def sorted_by_time(self):
 return sorted(self, key=operator.attrgetter('created_at'))

 @property
 def oldest(self):
 return min(self, key=operator.attrgetter('created_at'))

 @property
 def newest(self):
 return max(self, key=operator.attrgetter('created_at'))

The Author class feels very messy. You extract a handful of information out of your tweets stored as attributes, but half of them end up unused.

Sort the array of tweets by author names before processing it;

Group the array by author names to gather together the tweets of a single author.

Proposed improvements:

import operator


class Authors(object):
 def __init__(self, tweets):
 unicity_key = operator.attrgetter('author.name')
 tweets = sorted(tweets, key=unicity_key)

 self.followers_count = 
 self.following_count = 
 self.total_tweets = 
 self.tweets_count = 
 for _, author_tweets in itertools.groupby(tweets, key=unicity_key):
 author_tweets = list(author_tweets)
 author = author_tweets[0].author
 self.followers_count[author.name] = author.followers_count
 self.following_count[author.name] = author.friends_count
 self.total_tweets[author.name] = author.statuses_count
 self.tweets_count[author.name] = len(author_tweets)

 def hbar_plot(self, ax, measurement='Followers', color=(0,0,1,1), incolor_measurement=None, height=1, textsize=7, **kwargs):
 measurements = 
 'Followers': self.followers_count,
 'Following': self.following_count,
 'Total Tweets': self.total_tweets,
 'Sample Tweets': self.tweets_count,
 
 author_measurement = measurements[measurement]
 sorted_authors = sorted(author_measurement, key=author_measurement.__getitem__)

 if isinstance(color, str):
 color = mplcolors.hex2color(mplcolors.cnames[color])

 if incolor_measurement is not None:
 color_measurement = measurements[incolor_measurement]
 minor_max = max(color_measurement.values())
 colors = [(*color[:3], color_measurement[author] / minor_max) for author in sorted_authors]
 measurement = ' (color: )'.format(measurement, incolor_measurement)
 else:
 colors = [color] * len(author_measurement)

 ticks, values = zip(*((i + height, author_measurement[author]) for i, author in enumerate(sorted_authors)))
 ax.barh([i - height / 2 for i in ticks], values, height=height, color=colors, **kwargs)
 ax.set_yticks(ticks)
 ax.set_yticklabels(sorted_authors, rotation='horizontal', size=textsize)
 ax.set_xlabel(measurement)
 plt.tight_layout()

edited Jul 18 at 10:34

answered Jul 18 at 0:40

Mathias Ettinger

21.7k32875

Out of the 5 units presented here, 2 are unused: split_texts and Adjustment. So a quick word about them:

You can drop the adjustment parameter of split_texts as it is unused;

You can use itertools.chain.from_iterable to flatten nested iterables;

You can simplify optional arguments by dropping **kwargs in favor of default values.

All in all, these two can be condensed to:

import itertools


class Adjustment:
 def __init__(self, title="An adjustment.", freq_lim=None, char_lim=None, exclude=None):
 self.title = title
 self.freq_lim = freq_lim
 self.char_lim = char_lim
 self.exclude = exclude


def split_texts(texts):
 return list(itertools.chain.from_iterable(map(str.split, texts)))

But the Adjustment class would be even more succint and easily extensible if it were one of the new dataclasses:

from dataclasses import dataclass


@dataclass(frozen=True)
class Adjustment:
 title: str = "An adjustment."
 freq_lim: int = None
 char_lim: int = None
 exclude: str = None

To improve things, we could:

check unicity using a set whose contain check is done in $mathcalO(1)$ instead of $mathcalO(n)$ for lists;

extract the desired attribute out of the object to test for unicity but return the actual tweet;

extract attributes step by step instead of having 4Ã‚Â explicit cases;

turn the function into a generator, since it will be fed into the list constructor anyway.

Proposed improvements:

from contextlib import suppress


def filter_unique(tweet_stats_list, output='status'):
 uniques = set()
 for tweet in tweet_stats_list:
 tweet_attr = tweet
 with suppress(AttributeError):
 tweet_attr = tweet_attr.retweeted_status
 if output == 'text':
 tweet_attr = tweet_attr.text
 if tweet_attr not in uniques:
 uniques.add(tweet_attr)
 yield tweet

Proposed improvements:

import operator


class Tweets(list):
 def __init__(self, tweets, filter_by_unique=False, **kwargs):
 if filter_by_unique:
 tweets = filter_unique(tweets)

 super().__init__(tweets, **kwargs)

 @property
 def sorted_by_time(self):
 return sorted(self, key=operator.attrgetter('created_at'))

 @property
 def oldest(self):
 return min(self, key=operator.attrgetter('created_at'))

 @property
 def newest(self):
 return max(self, key=operator.attrgetter('created_at'))

The Author class feels very messy. You extract a handful of information out of your tweets stored as attributes, but half of them end up unused.

Sort the array of tweets by author names before processing it;

Group the array by author names to gather together the tweets of a single author.

Proposed improvements:

import operator


class Authors(object):
 def __init__(self, tweets):
 unicity_key = operator.attrgetter('author.name')
 tweets = sorted(tweets, key=unicity_key)

 self.followers_count = 
 self.following_count = 
 self.total_tweets = 
 self.tweets_count = 
 for _, author_tweets in itertools.groupby(tweets, key=unicity_key):
 author_tweets = list(author_tweets)
 author = author_tweets[0].author
 self.followers_count[author.name] = author.followers_count
 self.following_count[author.name] = author.friends_count
 self.total_tweets[author.name] = author.statuses_count
 self.tweets_count[author.name] = len(author_tweets)

 def hbar_plot(self, ax, measurement='Followers', color=(0,0,1,1), incolor_measurement=None, height=1, textsize=7, **kwargs):
 measurements = 
 'Followers': self.followers_count,
 'Following': self.following_count,
 'Total Tweets': self.total_tweets,
 'Sample Tweets': self.tweets_count,
 
 author_measurement = measurements[measurement]
 sorted_authors = sorted(author_measurement, key=author_measurement.__getitem__)

 if isinstance(color, str):
 color = mplcolors.hex2color(mplcolors.cnames[color])

 if incolor_measurement is not None:
 color_measurement = measurements[incolor_measurement]
 minor_max = max(color_measurement.values())
 colors = [(*color[:3], color_measurement[author] / minor_max) for author in sorted_authors]
 measurement = ' (color: )'.format(measurement, incolor_measurement)
 else:
 colors = [color] * len(author_measurement)

 ticks, values = zip(*((i + height, author_measurement[author]) for i, author in enumerate(sorted_authors)))
 ax.barh([i - height / 2 for i in ticks], values, height=height, color=colors, **kwargs)
 ax.set_yticks(ticks)
 ax.set_yticklabels(sorted_authors, rotation='horizontal', size=textsize)
 ax.set_xlabel(measurement)
 plt.tight_layout()

edited Jul 18 at 10:34

answered Jul 18 at 0:40

Mathias Ettinger

21.7k32875

edited Jul 18 at 10:34

answered Jul 18 at 0:40

Mathias Ettinger

21.7k32875

answered Jul 18 at 0:40

Mathias Ettinger

21.7k32875

answered Jul 18 at 0:40

Mathias Ettinger

21.7k32875

Thanks for the review. The Adjustment class may be updated with more inputs for more possible adjustments.
â€“Â Arief
Jul 18 at 10:06

@Arief Possibly. But you don't extract anything else from kwargs and you don't store it either; so there is no need in having it around.
â€“Â Mathias Ettinger
Jul 18 at 10:22

Whyfilter_unique is not useful..? the argument output=... is for preference whether the output is in form of Status with full features, or only the text. Each Status object has unique ID, two with same ID could be mixed in without notice.. the filtering is on ID by default.
â€“Â Arief
Jul 26 at 10:44

1

@Arief In your original version, if output is text, you check for status but store text in unique; thus no filtering taking place. Now I don't know the inner details of your objects but if statuses have a unique ID, you should store and check for that ID instead.
â€“Â Mathias Ettinger
Jul 26 at 11:45

1

@Arief Because 1) we reuse the expression twice, so better give it a name; 2) namming a lambda defeat the purpose and is better done by defining a function; 3) instead of defining a function here and reinventing the wheel, it's better to use what's already in the language; 4) lambdas are slow anyway and the operator module is implemented in C.
â€“Â Mathias Ettinger
Jul 31 at 6:41

Â |Â
show 3 more comments

Thanks for the review. The Adjustment class may be updated with more inputs for more possible adjustments.
â€“Â Arief
Jul 18 at 10:06

@Arief Possibly. But you don't extract anything else from kwargs and you don't store it either; so there is no need in having it around.
â€“Â Mathias Ettinger
Jul 18 at 10:22

Whyfilter_unique is not useful..? the argument output=... is for preference whether the output is in form of Status with full features, or only the text. Each Status object has unique ID, two with same ID could be mixed in without notice.. the filtering is on ID by default.
â€“Â Arief
Jul 26 at 10:44

1

@Arief In your original version, if output is text, you check for status but store text in unique; thus no filtering taking place. Now I don't know the inner details of your objects but if statuses have a unique ID, you should store and check for that ID instead.
â€“Â Mathias Ettinger
Jul 26 at 11:45

1

@Arief Because 1) we reuse the expression twice, so better give it a name; 2) namming a lambda defeat the purpose and is better done by defining a function; 3) instead of defining a function here and reinventing the wheel, it's better to use what's already in the language; 4) lambdas are slow anyway and the operator module is implemented in C.
â€“Â Mathias Ettinger
Jul 31 at 6:41

Thanks for the review. The Adjustment class may be updated with more inputs for more possible adjustments.
â€“Â Arief
Jul 18 at 10:06

@Arief Possibly. But you don't extract anything else from kwargs and you don't store it either; so there is no need in having it around.
â€“Â Mathias Ettinger
Jul 18 at 10:22

Whyfilter_unique is not useful..? the argument output=... is for preference whether the output is in form of Status with full features, or only the text. Each Status object has unique ID, two with same ID could be mixed in without notice.. the filtering is on ID by default.
â€“Â Arief
Jul 26 at 10:44

@Arief In your original version, if output is text, you check for status but store text in unique; thus no filtering taking place. Now I don't know the inner details of your objects but if statuses have a unique ID, you should store and check for that ID instead.
â€“Â Mathias Ettinger
Jul 26 at 11:45

@Arief Because 1) we reuse the expression twice, so better give it a name; 2) namming a lambda defeat the purpose and is better done by defining a function; 3) instead of defining a function here and reinventing the wheel, it's better to use what's already in the language; 4) lambdas are slow anyway and the operator module is implemented in C.
â€“Â Mathias Ettinger
Jul 31 at 6:41

Â |Â
show 3 more comments

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

搜尋此網誌

trjhtr