Slow Flask-SQLAlchemy query using association tables

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
3
down vote

favorite












I have two models in Flask-SQLAlchemy (Post and Comment) that have many-to-many relationship that is manifested in the third model (post_mentions):



post_mentions = db.Table(
'post_mentions',
db.Column('post_id', db.Integer, db.ForeignKey('posts.id'), primary_key=True),
db.Column('comment_id', db.Integer, db.ForeignKey('comments.id'), primary_key=True),
)

class Post(db.Model):
__tablename__ = 'posts'

id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String, unique=True, nullable=False)
mentions = db.relationship('Comment', secondary=post_mentions, lazy='dynamic')

def __eq__(self, other):
return self.name.lower() == other.name.lower()

def __hash__(self):
return hash(self.name.lower())


class Comment(db.Model):
__tablename__ = 'comments'

id = db.Column(db.Integer, primary_key=True)
text = db.Column(db.Text, nullable=False)
created_at = db.Column(db.Integer, nullable=False)


There is also a /posts endpoint that triggers the following query:



# flask and other imports

@app.route('/posts')
def posts():
page_num = request.args.get('page', 1)
posts = models.Post.query.join(models.post_mentions)
.group_by(models.post_mentions.columns.post_id)
.order_by(func.count(models.post_mentions.columns.post_id).desc())
.paginate(page=int(page_num), per_page=25)
return render_template('posts.html', posts=posts)


There are more than 14k+ posts and 32k+ comments stored in SQLite database. As you can see from the snippet above, when someone hits /posts endpoint, SQLAlchemy loads all data at once to the memory and then subsequent queries (e.g. retrieving posts, comments to that posts, etc..) take sub-millisecond time, since data is being served from the memory without hitting the database. Initial load takes 10s+ on my laptop, which is, to put it mildly, suboptimal.



So the question is: Considering that users won't view 97+% of posts, how can I both order posts by number of mentions in comments and load them on demand instead of doing it in one swoop?







share|improve this question












bumped to the homepage by Community♦ yesterday


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.




















    up vote
    3
    down vote

    favorite












    I have two models in Flask-SQLAlchemy (Post and Comment) that have many-to-many relationship that is manifested in the third model (post_mentions):



    post_mentions = db.Table(
    'post_mentions',
    db.Column('post_id', db.Integer, db.ForeignKey('posts.id'), primary_key=True),
    db.Column('comment_id', db.Integer, db.ForeignKey('comments.id'), primary_key=True),
    )

    class Post(db.Model):
    __tablename__ = 'posts'

    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String, unique=True, nullable=False)
    mentions = db.relationship('Comment', secondary=post_mentions, lazy='dynamic')

    def __eq__(self, other):
    return self.name.lower() == other.name.lower()

    def __hash__(self):
    return hash(self.name.lower())


    class Comment(db.Model):
    __tablename__ = 'comments'

    id = db.Column(db.Integer, primary_key=True)
    text = db.Column(db.Text, nullable=False)
    created_at = db.Column(db.Integer, nullable=False)


    There is also a /posts endpoint that triggers the following query:



    # flask and other imports

    @app.route('/posts')
    def posts():
    page_num = request.args.get('page', 1)
    posts = models.Post.query.join(models.post_mentions)
    .group_by(models.post_mentions.columns.post_id)
    .order_by(func.count(models.post_mentions.columns.post_id).desc())
    .paginate(page=int(page_num), per_page=25)
    return render_template('posts.html', posts=posts)


    There are more than 14k+ posts and 32k+ comments stored in SQLite database. As you can see from the snippet above, when someone hits /posts endpoint, SQLAlchemy loads all data at once to the memory and then subsequent queries (e.g. retrieving posts, comments to that posts, etc..) take sub-millisecond time, since data is being served from the memory without hitting the database. Initial load takes 10s+ on my laptop, which is, to put it mildly, suboptimal.



    So the question is: Considering that users won't view 97+% of posts, how can I both order posts by number of mentions in comments and load them on demand instead of doing it in one swoop?







    share|improve this question












    bumped to the homepage by Community♦ yesterday


    This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
















      up vote
      3
      down vote

      favorite









      up vote
      3
      down vote

      favorite











      I have two models in Flask-SQLAlchemy (Post and Comment) that have many-to-many relationship that is manifested in the third model (post_mentions):



      post_mentions = db.Table(
      'post_mentions',
      db.Column('post_id', db.Integer, db.ForeignKey('posts.id'), primary_key=True),
      db.Column('comment_id', db.Integer, db.ForeignKey('comments.id'), primary_key=True),
      )

      class Post(db.Model):
      __tablename__ = 'posts'

      id = db.Column(db.Integer, primary_key=True)
      name = db.Column(db.String, unique=True, nullable=False)
      mentions = db.relationship('Comment', secondary=post_mentions, lazy='dynamic')

      def __eq__(self, other):
      return self.name.lower() == other.name.lower()

      def __hash__(self):
      return hash(self.name.lower())


      class Comment(db.Model):
      __tablename__ = 'comments'

      id = db.Column(db.Integer, primary_key=True)
      text = db.Column(db.Text, nullable=False)
      created_at = db.Column(db.Integer, nullable=False)


      There is also a /posts endpoint that triggers the following query:



      # flask and other imports

      @app.route('/posts')
      def posts():
      page_num = request.args.get('page', 1)
      posts = models.Post.query.join(models.post_mentions)
      .group_by(models.post_mentions.columns.post_id)
      .order_by(func.count(models.post_mentions.columns.post_id).desc())
      .paginate(page=int(page_num), per_page=25)
      return render_template('posts.html', posts=posts)


      There are more than 14k+ posts and 32k+ comments stored in SQLite database. As you can see from the snippet above, when someone hits /posts endpoint, SQLAlchemy loads all data at once to the memory and then subsequent queries (e.g. retrieving posts, comments to that posts, etc..) take sub-millisecond time, since data is being served from the memory without hitting the database. Initial load takes 10s+ on my laptop, which is, to put it mildly, suboptimal.



      So the question is: Considering that users won't view 97+% of posts, how can I both order posts by number of mentions in comments and load them on demand instead of doing it in one swoop?







      share|improve this question











      I have two models in Flask-SQLAlchemy (Post and Comment) that have many-to-many relationship that is manifested in the third model (post_mentions):



      post_mentions = db.Table(
      'post_mentions',
      db.Column('post_id', db.Integer, db.ForeignKey('posts.id'), primary_key=True),
      db.Column('comment_id', db.Integer, db.ForeignKey('comments.id'), primary_key=True),
      )

      class Post(db.Model):
      __tablename__ = 'posts'

      id = db.Column(db.Integer, primary_key=True)
      name = db.Column(db.String, unique=True, nullable=False)
      mentions = db.relationship('Comment', secondary=post_mentions, lazy='dynamic')

      def __eq__(self, other):
      return self.name.lower() == other.name.lower()

      def __hash__(self):
      return hash(self.name.lower())


      class Comment(db.Model):
      __tablename__ = 'comments'

      id = db.Column(db.Integer, primary_key=True)
      text = db.Column(db.Text, nullable=False)
      created_at = db.Column(db.Integer, nullable=False)


      There is also a /posts endpoint that triggers the following query:



      # flask and other imports

      @app.route('/posts')
      def posts():
      page_num = request.args.get('page', 1)
      posts = models.Post.query.join(models.post_mentions)
      .group_by(models.post_mentions.columns.post_id)
      .order_by(func.count(models.post_mentions.columns.post_id).desc())
      .paginate(page=int(page_num), per_page=25)
      return render_template('posts.html', posts=posts)


      There are more than 14k+ posts and 32k+ comments stored in SQLite database. As you can see from the snippet above, when someone hits /posts endpoint, SQLAlchemy loads all data at once to the memory and then subsequent queries (e.g. retrieving posts, comments to that posts, etc..) take sub-millisecond time, since data is being served from the memory without hitting the database. Initial load takes 10s+ on my laptop, which is, to put it mildly, suboptimal.



      So the question is: Considering that users won't view 97+% of posts, how can I both order posts by number of mentions in comments and load them on demand instead of doing it in one swoop?









      share|improve this question










      share|improve this question




      share|improve this question









      asked Jun 7 at 13:14









      Richard Feynman

      161




      161





      bumped to the homepage by Community♦ yesterday


      This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.







      bumped to the homepage by Community♦ yesterday


      This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.






















          1 Answer
          1






          active

          oldest

          votes

















          up vote
          1
          down vote













          Saw your post on indiehackers. I don't know this orm, but generally speaking, I see you have two options.



          Decide to preload/precache the data when your app starts and refresh it occasionally, if you insist on having all records available.



          But some good advice I've read is : never do in real time what you can do in advance. So... Why not even build some "top posts" table and seed that?






          share|improve this answer





















            Your Answer




            StackExchange.ifUsing("editor", function ()
            return StackExchange.using("mathjaxEditing", function ()
            StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
            StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
            );
            );
            , "mathjax-editing");

            StackExchange.ifUsing("editor", function ()
            StackExchange.using("externalEditor", function ()
            StackExchange.using("snippets", function ()
            StackExchange.snippets.init();
            );
            );
            , "code-snippets");

            StackExchange.ready(function()
            var channelOptions =
            tags: "".split(" "),
            id: "196"
            ;
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function()
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled)
            StackExchange.using("snippets", function()
            createEditor();
            );

            else
            createEditor();

            );

            function createEditor()
            StackExchange.prepareEditor(
            heartbeatType: 'answer',
            convertImagesToLinks: false,
            noModals: false,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            );



            );








             

            draft saved


            draft discarded


















            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f196034%2fslow-flask-sqlalchemy-query-using-association-tables%23new-answer', 'question_page');

            );

            Post as a guest






























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            1
            down vote













            Saw your post on indiehackers. I don't know this orm, but generally speaking, I see you have two options.



            Decide to preload/precache the data when your app starts and refresh it occasionally, if you insist on having all records available.



            But some good advice I've read is : never do in real time what you can do in advance. So... Why not even build some "top posts" table and seed that?






            share|improve this answer

























              up vote
              1
              down vote













              Saw your post on indiehackers. I don't know this orm, but generally speaking, I see you have two options.



              Decide to preload/precache the data when your app starts and refresh it occasionally, if you insist on having all records available.



              But some good advice I've read is : never do in real time what you can do in advance. So... Why not even build some "top posts" table and seed that?






              share|improve this answer























                up vote
                1
                down vote










                up vote
                1
                down vote









                Saw your post on indiehackers. I don't know this orm, but generally speaking, I see you have two options.



                Decide to preload/precache the data when your app starts and refresh it occasionally, if you insist on having all records available.



                But some good advice I've read is : never do in real time what you can do in advance. So... Why not even build some "top posts" table and seed that?






                share|improve this answer













                Saw your post on indiehackers. I don't know this orm, but generally speaking, I see you have two options.



                Decide to preload/precache the data when your app starts and refresh it occasionally, if you insist on having all records available.



                But some good advice I've read is : never do in real time what you can do in advance. So... Why not even build some "top posts" table and seed that?







                share|improve this answer













                share|improve this answer



                share|improve this answer











                answered Jun 9 at 16:05









                Dale Holborow

                1111




                1111






















                     

                    draft saved


                    draft discarded


























                     


                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function ()
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f196034%2fslow-flask-sqlalchemy-query-using-association-tables%23new-answer', 'question_page');

                    );

                    Post as a guest













































































                    Popular posts from this blog

                    Greedy Best First Search implementation in Rust

                    Function to Return a JSON Like Objects Using VBA Collections and Arrays

                    C++11 CLH Lock Implementation