Python JIT instantiation

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
6
down vote

favorite












This code is about instantiating object attributes Just in Time.



I'm working on a REST API, where I currently create Python objects from the JSON data returned by the server, though not all attributes of the created objects will be called upon in user code. Hence only instantiating the attributes the user access via dot notation seems like a valid way to improve performance.



lazy_evaluate is a function, with a signature similar to functools.partial.
lazy_evaluate returns a function that when called multiple times only computes the result once.



I then take advantage of the builtin property function to create a getter that calls the function returned by lazy_evaluate.



There is clearly a trade off here. This only helps performance if a subset of the attributes are required in user code. Otherwise the extra code will slow it down.



The core of the idea is to use Python scoping to mutate a list in the scope outside the function evaluate. I dislike this the most.



  1. Do you think it's a good idea?

  2. Do you have a better solution?

  3. Do you think the complexity is worth it?

I will try and create more context here as to why I want this. I have implemented this idea in my package async_v20 found here. I have written docs here.



I left this out as I felt it was a distraction from the concept.



JSON from the server does get parsed into a python dict. Though aiohttp does this for me.



When the server sends JSON:



  • All attributes are converted to snake_case via a dictionary lookup

  • Time data is turned into a pandas.TimeStamp (requires extra parsing)

In some cases there may be 5000 objects to instantiate or objects may be nested around 3 deep. Objects get instantiated in this module.



This example runs a benchmark:



from time import time

class Time(object):
def __enter__(self):
self.start = time()
return self
def __exit__(self, exc_type, exc_val, exc_tb):
self.end = time()
self.interval = self.end - self.start

def lazy_evaluate(func, *args, **kwargs):
acc =
def evaluate():
if acc == :
print('EVALUATING')
acc.append(func(*args, **kwargs))
return acc[0]
return evaluate



class JitAttributes(object):

foo = property(lambda self: getattr(self, '_foo')())
bar = property(lambda self: getattr(self, '_bar')())
baz = property(lambda self: getattr(self, '_baz')())

def __init__(self, foo, bar, baz):
self._foo = foo
self._bar = bar
self._baz = baz

class Attributes(object):

def __init__(self, foo, bar, baz):
self.foo = foo
self.bar = bar
self.baz = baz


a = 500 # Careful making this too big!
b = a * 2
c = b * 2

print('RUNNING JitAttributes')
repeats = range(100000)

with Time() as t:
for _ in repeats:
JitAttributes(
lazy_evaluate(list, range(a)),
lazy_evaluate(list, range(b)),
lazy_evaluate(list, range(c)),
)

print('TOOK ', t.interval, ' seconds')


print('RUNNING Attributes')

with Time() as t:
for _ in repeats:
Attributes(
list(range(a)),
list(range(b)),
list(range(c)),
)

print('TOOK ', t.interval, ' seconds')


Outputs:





RUNNING JitAttributes
TOOK 0.2956113815307617 seconds
RUNNING Attributes
TOOK 4.880501985549927 seconds


Console demonstration:



>>> jit_instance = JitAttributes(
... lazy_evaluate(list, range(a)),
... lazy_evaluate(list, range(b)),
... lazy_evaluate(list, range(c)),
... )
>>> jit_instance.foo
# EVALUATING <- Only evaluates once :)
# [0, 1, 2, 3, ... ]
>>> jit_instance.foo
# [0, 1, 2, 3, ... ]
>>> non_jit_instance = Attributes(
... list(range(a)),
... list(range(b)),
... list(range(c)),
... )

>>> non_jit_instance.foo
# [0, 1, 2, 3, ...]
>>> non_jit_instance.foo
# [0, 1, 2, 3, ...]


Update



New review following on from this one







share|improve this question



























    up vote
    6
    down vote

    favorite












    This code is about instantiating object attributes Just in Time.



    I'm working on a REST API, where I currently create Python objects from the JSON data returned by the server, though not all attributes of the created objects will be called upon in user code. Hence only instantiating the attributes the user access via dot notation seems like a valid way to improve performance.



    lazy_evaluate is a function, with a signature similar to functools.partial.
    lazy_evaluate returns a function that when called multiple times only computes the result once.



    I then take advantage of the builtin property function to create a getter that calls the function returned by lazy_evaluate.



    There is clearly a trade off here. This only helps performance if a subset of the attributes are required in user code. Otherwise the extra code will slow it down.



    The core of the idea is to use Python scoping to mutate a list in the scope outside the function evaluate. I dislike this the most.



    1. Do you think it's a good idea?

    2. Do you have a better solution?

    3. Do you think the complexity is worth it?

    I will try and create more context here as to why I want this. I have implemented this idea in my package async_v20 found here. I have written docs here.



    I left this out as I felt it was a distraction from the concept.



    JSON from the server does get parsed into a python dict. Though aiohttp does this for me.



    When the server sends JSON:



    • All attributes are converted to snake_case via a dictionary lookup

    • Time data is turned into a pandas.TimeStamp (requires extra parsing)

    In some cases there may be 5000 objects to instantiate or objects may be nested around 3 deep. Objects get instantiated in this module.



    This example runs a benchmark:



    from time import time

    class Time(object):
    def __enter__(self):
    self.start = time()
    return self
    def __exit__(self, exc_type, exc_val, exc_tb):
    self.end = time()
    self.interval = self.end - self.start

    def lazy_evaluate(func, *args, **kwargs):
    acc =
    def evaluate():
    if acc == :
    print('EVALUATING')
    acc.append(func(*args, **kwargs))
    return acc[0]
    return evaluate



    class JitAttributes(object):

    foo = property(lambda self: getattr(self, '_foo')())
    bar = property(lambda self: getattr(self, '_bar')())
    baz = property(lambda self: getattr(self, '_baz')())

    def __init__(self, foo, bar, baz):
    self._foo = foo
    self._bar = bar
    self._baz = baz

    class Attributes(object):

    def __init__(self, foo, bar, baz):
    self.foo = foo
    self.bar = bar
    self.baz = baz


    a = 500 # Careful making this too big!
    b = a * 2
    c = b * 2

    print('RUNNING JitAttributes')
    repeats = range(100000)

    with Time() as t:
    for _ in repeats:
    JitAttributes(
    lazy_evaluate(list, range(a)),
    lazy_evaluate(list, range(b)),
    lazy_evaluate(list, range(c)),
    )

    print('TOOK ', t.interval, ' seconds')


    print('RUNNING Attributes')

    with Time() as t:
    for _ in repeats:
    Attributes(
    list(range(a)),
    list(range(b)),
    list(range(c)),
    )

    print('TOOK ', t.interval, ' seconds')


    Outputs:





    RUNNING JitAttributes
    TOOK 0.2956113815307617 seconds
    RUNNING Attributes
    TOOK 4.880501985549927 seconds


    Console demonstration:



    >>> jit_instance = JitAttributes(
    ... lazy_evaluate(list, range(a)),
    ... lazy_evaluate(list, range(b)),
    ... lazy_evaluate(list, range(c)),
    ... )
    >>> jit_instance.foo
    # EVALUATING <- Only evaluates once :)
    # [0, 1, 2, 3, ... ]
    >>> jit_instance.foo
    # [0, 1, 2, 3, ... ]
    >>> non_jit_instance = Attributes(
    ... list(range(a)),
    ... list(range(b)),
    ... list(range(c)),
    ... )

    >>> non_jit_instance.foo
    # [0, 1, 2, 3, ...]
    >>> non_jit_instance.foo
    # [0, 1, 2, 3, ...]


    Update



    New review following on from this one







    share|improve this question























      up vote
      6
      down vote

      favorite









      up vote
      6
      down vote

      favorite











      This code is about instantiating object attributes Just in Time.



      I'm working on a REST API, where I currently create Python objects from the JSON data returned by the server, though not all attributes of the created objects will be called upon in user code. Hence only instantiating the attributes the user access via dot notation seems like a valid way to improve performance.



      lazy_evaluate is a function, with a signature similar to functools.partial.
      lazy_evaluate returns a function that when called multiple times only computes the result once.



      I then take advantage of the builtin property function to create a getter that calls the function returned by lazy_evaluate.



      There is clearly a trade off here. This only helps performance if a subset of the attributes are required in user code. Otherwise the extra code will slow it down.



      The core of the idea is to use Python scoping to mutate a list in the scope outside the function evaluate. I dislike this the most.



      1. Do you think it's a good idea?

      2. Do you have a better solution?

      3. Do you think the complexity is worth it?

      I will try and create more context here as to why I want this. I have implemented this idea in my package async_v20 found here. I have written docs here.



      I left this out as I felt it was a distraction from the concept.



      JSON from the server does get parsed into a python dict. Though aiohttp does this for me.



      When the server sends JSON:



      • All attributes are converted to snake_case via a dictionary lookup

      • Time data is turned into a pandas.TimeStamp (requires extra parsing)

      In some cases there may be 5000 objects to instantiate or objects may be nested around 3 deep. Objects get instantiated in this module.



      This example runs a benchmark:



      from time import time

      class Time(object):
      def __enter__(self):
      self.start = time()
      return self
      def __exit__(self, exc_type, exc_val, exc_tb):
      self.end = time()
      self.interval = self.end - self.start

      def lazy_evaluate(func, *args, **kwargs):
      acc =
      def evaluate():
      if acc == :
      print('EVALUATING')
      acc.append(func(*args, **kwargs))
      return acc[0]
      return evaluate



      class JitAttributes(object):

      foo = property(lambda self: getattr(self, '_foo')())
      bar = property(lambda self: getattr(self, '_bar')())
      baz = property(lambda self: getattr(self, '_baz')())

      def __init__(self, foo, bar, baz):
      self._foo = foo
      self._bar = bar
      self._baz = baz

      class Attributes(object):

      def __init__(self, foo, bar, baz):
      self.foo = foo
      self.bar = bar
      self.baz = baz


      a = 500 # Careful making this too big!
      b = a * 2
      c = b * 2

      print('RUNNING JitAttributes')
      repeats = range(100000)

      with Time() as t:
      for _ in repeats:
      JitAttributes(
      lazy_evaluate(list, range(a)),
      lazy_evaluate(list, range(b)),
      lazy_evaluate(list, range(c)),
      )

      print('TOOK ', t.interval, ' seconds')


      print('RUNNING Attributes')

      with Time() as t:
      for _ in repeats:
      Attributes(
      list(range(a)),
      list(range(b)),
      list(range(c)),
      )

      print('TOOK ', t.interval, ' seconds')


      Outputs:





      RUNNING JitAttributes
      TOOK 0.2956113815307617 seconds
      RUNNING Attributes
      TOOK 4.880501985549927 seconds


      Console demonstration:



      >>> jit_instance = JitAttributes(
      ... lazy_evaluate(list, range(a)),
      ... lazy_evaluate(list, range(b)),
      ... lazy_evaluate(list, range(c)),
      ... )
      >>> jit_instance.foo
      # EVALUATING <- Only evaluates once :)
      # [0, 1, 2, 3, ... ]
      >>> jit_instance.foo
      # [0, 1, 2, 3, ... ]
      >>> non_jit_instance = Attributes(
      ... list(range(a)),
      ... list(range(b)),
      ... list(range(c)),
      ... )

      >>> non_jit_instance.foo
      # [0, 1, 2, 3, ...]
      >>> non_jit_instance.foo
      # [0, 1, 2, 3, ...]


      Update



      New review following on from this one







      share|improve this question













      This code is about instantiating object attributes Just in Time.



      I'm working on a REST API, where I currently create Python objects from the JSON data returned by the server, though not all attributes of the created objects will be called upon in user code. Hence only instantiating the attributes the user access via dot notation seems like a valid way to improve performance.



      lazy_evaluate is a function, with a signature similar to functools.partial.
      lazy_evaluate returns a function that when called multiple times only computes the result once.



      I then take advantage of the builtin property function to create a getter that calls the function returned by lazy_evaluate.



      There is clearly a trade off here. This only helps performance if a subset of the attributes are required in user code. Otherwise the extra code will slow it down.



      The core of the idea is to use Python scoping to mutate a list in the scope outside the function evaluate. I dislike this the most.



      1. Do you think it's a good idea?

      2. Do you have a better solution?

      3. Do you think the complexity is worth it?

      I will try and create more context here as to why I want this. I have implemented this idea in my package async_v20 found here. I have written docs here.



      I left this out as I felt it was a distraction from the concept.



      JSON from the server does get parsed into a python dict. Though aiohttp does this for me.



      When the server sends JSON:



      • All attributes are converted to snake_case via a dictionary lookup

      • Time data is turned into a pandas.TimeStamp (requires extra parsing)

      In some cases there may be 5000 objects to instantiate or objects may be nested around 3 deep. Objects get instantiated in this module.



      This example runs a benchmark:



      from time import time

      class Time(object):
      def __enter__(self):
      self.start = time()
      return self
      def __exit__(self, exc_type, exc_val, exc_tb):
      self.end = time()
      self.interval = self.end - self.start

      def lazy_evaluate(func, *args, **kwargs):
      acc =
      def evaluate():
      if acc == :
      print('EVALUATING')
      acc.append(func(*args, **kwargs))
      return acc[0]
      return evaluate



      class JitAttributes(object):

      foo = property(lambda self: getattr(self, '_foo')())
      bar = property(lambda self: getattr(self, '_bar')())
      baz = property(lambda self: getattr(self, '_baz')())

      def __init__(self, foo, bar, baz):
      self._foo = foo
      self._bar = bar
      self._baz = baz

      class Attributes(object):

      def __init__(self, foo, bar, baz):
      self.foo = foo
      self.bar = bar
      self.baz = baz


      a = 500 # Careful making this too big!
      b = a * 2
      c = b * 2

      print('RUNNING JitAttributes')
      repeats = range(100000)

      with Time() as t:
      for _ in repeats:
      JitAttributes(
      lazy_evaluate(list, range(a)),
      lazy_evaluate(list, range(b)),
      lazy_evaluate(list, range(c)),
      )

      print('TOOK ', t.interval, ' seconds')


      print('RUNNING Attributes')

      with Time() as t:
      for _ in repeats:
      Attributes(
      list(range(a)),
      list(range(b)),
      list(range(c)),
      )

      print('TOOK ', t.interval, ' seconds')


      Outputs:





      RUNNING JitAttributes
      TOOK 0.2956113815307617 seconds
      RUNNING Attributes
      TOOK 4.880501985549927 seconds


      Console demonstration:



      >>> jit_instance = JitAttributes(
      ... lazy_evaluate(list, range(a)),
      ... lazy_evaluate(list, range(b)),
      ... lazy_evaluate(list, range(c)),
      ... )
      >>> jit_instance.foo
      # EVALUATING <- Only evaluates once :)
      # [0, 1, 2, 3, ... ]
      >>> jit_instance.foo
      # [0, 1, 2, 3, ... ]
      >>> non_jit_instance = Attributes(
      ... list(range(a)),
      ... list(range(b)),
      ... list(range(c)),
      ... )

      >>> non_jit_instance.foo
      # [0, 1, 2, 3, ...]
      >>> non_jit_instance.foo
      # [0, 1, 2, 3, ...]


      Update



      New review following on from this one









      share|improve this question












      share|improve this question




      share|improve this question








      edited Jan 8 at 11:39
























      asked Jan 5 at 13:10









      James Schinner

      422113




      422113




















          1 Answer
          1






          active

          oldest

          votes

















          up vote
          3
          down vote



          accepted










          I have a problem with your code, in that the code seems unrelated to the problem statement you give at the top.



          You say you are "working on a REST API, where I currently create python objects from the JSON data returned by the server. Though not all attributes of the created objects will be called upon in user code."



          None of your code involves REST, or JSON. So while it's an interesting demonstration of certain parts of Python, it doesn't much address your issue.



          IMO, the feasibility of what you are discussing is going to be determined by the nature and character of the JSON data, and the REST API. Parsing JSON is generally a one-shot deal. If you get JSON data like this:



          ' "name":"John", "age":31, "city":"New York" '


          how would it benefit you to "defer" the evaluation or construction of the age and city attributes? It would make far more sense to just parse and assign them all.



          On the other hand, if your API returns a list of 1,000's of, say, homes with lat/long coordinates, and part of the object initialization process involves computing the distance from a given origin address, then perhaps delaying the computation or delaying object instantiation makes sense.



          Please share with us the nature of the problem you're really trying to solve, and we'll be able to comment effectively on whether your proposed solution is a good idea.



          EDIT:



          I had a look at your linked code. The key was, I think, in the helpers.py:get_attribute() function.



          You appear to be re-implementing the behavior of the __getattr__ or __getattribute__ dundermethods. I suggest that you look at implementing a simpler approach, where you store (say) '_attrname' as the lazy-evaluating function, and use __getattr__ to expand that value into 'attrname' when you fetch it the first time.



          Since Python will check for obj.attrname and only call obj.__getattr__() if not found, this would provide the advantage of doing the subsequent lookups in C code rather than Python.






          share|improve this answer























          • Ok, fair point. I updated my post
            – James Schinner
            Jan 5 at 23:06










          • Edited. That code is pretty opaque, which seems to be characteristic of ORM code. :( But I think getattr might be your friend. qv.
            – Austin Hastings
            Jan 6 at 2:55










          • That's a great suggestion! Thanks. Mmm I started to re-evaluate that whole ORM code. I got it working once and forgot about it... Truly valuable feedback!
            – James Schinner
            Jan 6 at 3:02






          • 1




            After making your suggested changes, my test suite runs around 10 seconds faster than previously. From ~46 to ~35, not the best benchmark but a nice indication. WooHoo!
            – James Schinner
            Jan 6 at 15:54






          • 1




            Never mind. Just know your suggestion got me a 66% improvement when simulating 1000 simultaneous requests. (when no attributes were accessed). Needless to say i'm pretty happy with it. Benchmark code: gist.github.com/jamespeterschinner/…
            – James Schinner
            Jan 8 at 2:29











          Your Answer




          StackExchange.ifUsing("editor", function ()
          return StackExchange.using("mathjaxEditing", function ()
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
          );
          );
          , "mathjax-editing");

          StackExchange.ifUsing("editor", function ()
          StackExchange.using("externalEditor", function ()
          StackExchange.using("snippets", function ()
          StackExchange.snippets.init();
          );
          );
          , "code-snippets");

          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "196"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          convertImagesToLinks: false,
          noModals: false,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );








           

          draft saved


          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f184363%2fpython-jit-instantiation%23new-answer', 'question_page');

          );

          Post as a guest






























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes








          up vote
          3
          down vote



          accepted










          I have a problem with your code, in that the code seems unrelated to the problem statement you give at the top.



          You say you are "working on a REST API, where I currently create python objects from the JSON data returned by the server. Though not all attributes of the created objects will be called upon in user code."



          None of your code involves REST, or JSON. So while it's an interesting demonstration of certain parts of Python, it doesn't much address your issue.



          IMO, the feasibility of what you are discussing is going to be determined by the nature and character of the JSON data, and the REST API. Parsing JSON is generally a one-shot deal. If you get JSON data like this:



          ' "name":"John", "age":31, "city":"New York" '


          how would it benefit you to "defer" the evaluation or construction of the age and city attributes? It would make far more sense to just parse and assign them all.



          On the other hand, if your API returns a list of 1,000's of, say, homes with lat/long coordinates, and part of the object initialization process involves computing the distance from a given origin address, then perhaps delaying the computation or delaying object instantiation makes sense.



          Please share with us the nature of the problem you're really trying to solve, and we'll be able to comment effectively on whether your proposed solution is a good idea.



          EDIT:



          I had a look at your linked code. The key was, I think, in the helpers.py:get_attribute() function.



          You appear to be re-implementing the behavior of the __getattr__ or __getattribute__ dundermethods. I suggest that you look at implementing a simpler approach, where you store (say) '_attrname' as the lazy-evaluating function, and use __getattr__ to expand that value into 'attrname' when you fetch it the first time.



          Since Python will check for obj.attrname and only call obj.__getattr__() if not found, this would provide the advantage of doing the subsequent lookups in C code rather than Python.






          share|improve this answer























          • Ok, fair point. I updated my post
            – James Schinner
            Jan 5 at 23:06










          • Edited. That code is pretty opaque, which seems to be characteristic of ORM code. :( But I think getattr might be your friend. qv.
            – Austin Hastings
            Jan 6 at 2:55










          • That's a great suggestion! Thanks. Mmm I started to re-evaluate that whole ORM code. I got it working once and forgot about it... Truly valuable feedback!
            – James Schinner
            Jan 6 at 3:02






          • 1




            After making your suggested changes, my test suite runs around 10 seconds faster than previously. From ~46 to ~35, not the best benchmark but a nice indication. WooHoo!
            – James Schinner
            Jan 6 at 15:54






          • 1




            Never mind. Just know your suggestion got me a 66% improvement when simulating 1000 simultaneous requests. (when no attributes were accessed). Needless to say i'm pretty happy with it. Benchmark code: gist.github.com/jamespeterschinner/…
            – James Schinner
            Jan 8 at 2:29















          up vote
          3
          down vote



          accepted










          I have a problem with your code, in that the code seems unrelated to the problem statement you give at the top.



          You say you are "working on a REST API, where I currently create python objects from the JSON data returned by the server. Though not all attributes of the created objects will be called upon in user code."



          None of your code involves REST, or JSON. So while it's an interesting demonstration of certain parts of Python, it doesn't much address your issue.



          IMO, the feasibility of what you are discussing is going to be determined by the nature and character of the JSON data, and the REST API. Parsing JSON is generally a one-shot deal. If you get JSON data like this:



          ' "name":"John", "age":31, "city":"New York" '


          how would it benefit you to "defer" the evaluation or construction of the age and city attributes? It would make far more sense to just parse and assign them all.



          On the other hand, if your API returns a list of 1,000's of, say, homes with lat/long coordinates, and part of the object initialization process involves computing the distance from a given origin address, then perhaps delaying the computation or delaying object instantiation makes sense.



          Please share with us the nature of the problem you're really trying to solve, and we'll be able to comment effectively on whether your proposed solution is a good idea.



          EDIT:



          I had a look at your linked code. The key was, I think, in the helpers.py:get_attribute() function.



          You appear to be re-implementing the behavior of the __getattr__ or __getattribute__ dundermethods. I suggest that you look at implementing a simpler approach, where you store (say) '_attrname' as the lazy-evaluating function, and use __getattr__ to expand that value into 'attrname' when you fetch it the first time.



          Since Python will check for obj.attrname and only call obj.__getattr__() if not found, this would provide the advantage of doing the subsequent lookups in C code rather than Python.






          share|improve this answer























          • Ok, fair point. I updated my post
            – James Schinner
            Jan 5 at 23:06










          • Edited. That code is pretty opaque, which seems to be characteristic of ORM code. :( But I think getattr might be your friend. qv.
            – Austin Hastings
            Jan 6 at 2:55










          • That's a great suggestion! Thanks. Mmm I started to re-evaluate that whole ORM code. I got it working once and forgot about it... Truly valuable feedback!
            – James Schinner
            Jan 6 at 3:02






          • 1




            After making your suggested changes, my test suite runs around 10 seconds faster than previously. From ~46 to ~35, not the best benchmark but a nice indication. WooHoo!
            – James Schinner
            Jan 6 at 15:54






          • 1




            Never mind. Just know your suggestion got me a 66% improvement when simulating 1000 simultaneous requests. (when no attributes were accessed). Needless to say i'm pretty happy with it. Benchmark code: gist.github.com/jamespeterschinner/…
            – James Schinner
            Jan 8 at 2:29













          up vote
          3
          down vote



          accepted







          up vote
          3
          down vote



          accepted






          I have a problem with your code, in that the code seems unrelated to the problem statement you give at the top.



          You say you are "working on a REST API, where I currently create python objects from the JSON data returned by the server. Though not all attributes of the created objects will be called upon in user code."



          None of your code involves REST, or JSON. So while it's an interesting demonstration of certain parts of Python, it doesn't much address your issue.



          IMO, the feasibility of what you are discussing is going to be determined by the nature and character of the JSON data, and the REST API. Parsing JSON is generally a one-shot deal. If you get JSON data like this:



          ' "name":"John", "age":31, "city":"New York" '


          how would it benefit you to "defer" the evaluation or construction of the age and city attributes? It would make far more sense to just parse and assign them all.



          On the other hand, if your API returns a list of 1,000's of, say, homes with lat/long coordinates, and part of the object initialization process involves computing the distance from a given origin address, then perhaps delaying the computation or delaying object instantiation makes sense.



          Please share with us the nature of the problem you're really trying to solve, and we'll be able to comment effectively on whether your proposed solution is a good idea.



          EDIT:



          I had a look at your linked code. The key was, I think, in the helpers.py:get_attribute() function.



          You appear to be re-implementing the behavior of the __getattr__ or __getattribute__ dundermethods. I suggest that you look at implementing a simpler approach, where you store (say) '_attrname' as the lazy-evaluating function, and use __getattr__ to expand that value into 'attrname' when you fetch it the first time.



          Since Python will check for obj.attrname and only call obj.__getattr__() if not found, this would provide the advantage of doing the subsequent lookups in C code rather than Python.






          share|improve this answer















          I have a problem with your code, in that the code seems unrelated to the problem statement you give at the top.



          You say you are "working on a REST API, where I currently create python objects from the JSON data returned by the server. Though not all attributes of the created objects will be called upon in user code."



          None of your code involves REST, or JSON. So while it's an interesting demonstration of certain parts of Python, it doesn't much address your issue.



          IMO, the feasibility of what you are discussing is going to be determined by the nature and character of the JSON data, and the REST API. Parsing JSON is generally a one-shot deal. If you get JSON data like this:



          ' "name":"John", "age":31, "city":"New York" '


          how would it benefit you to "defer" the evaluation or construction of the age and city attributes? It would make far more sense to just parse and assign them all.



          On the other hand, if your API returns a list of 1,000's of, say, homes with lat/long coordinates, and part of the object initialization process involves computing the distance from a given origin address, then perhaps delaying the computation or delaying object instantiation makes sense.



          Please share with us the nature of the problem you're really trying to solve, and we'll be able to comment effectively on whether your proposed solution is a good idea.



          EDIT:



          I had a look at your linked code. The key was, I think, in the helpers.py:get_attribute() function.



          You appear to be re-implementing the behavior of the __getattr__ or __getattribute__ dundermethods. I suggest that you look at implementing a simpler approach, where you store (say) '_attrname' as the lazy-evaluating function, and use __getattr__ to expand that value into 'attrname' when you fetch it the first time.



          Since Python will check for obj.attrname and only call obj.__getattr__() if not found, this would provide the advantage of doing the subsequent lookups in C code rather than Python.







          share|improve this answer















          share|improve this answer



          share|improve this answer








          edited Jan 6 at 2:54


























          answered Jan 5 at 17:40









          Austin Hastings

          6,1591130




          6,1591130











          • Ok, fair point. I updated my post
            – James Schinner
            Jan 5 at 23:06










          • Edited. That code is pretty opaque, which seems to be characteristic of ORM code. :( But I think getattr might be your friend. qv.
            – Austin Hastings
            Jan 6 at 2:55










          • That's a great suggestion! Thanks. Mmm I started to re-evaluate that whole ORM code. I got it working once and forgot about it... Truly valuable feedback!
            – James Schinner
            Jan 6 at 3:02






          • 1




            After making your suggested changes, my test suite runs around 10 seconds faster than previously. From ~46 to ~35, not the best benchmark but a nice indication. WooHoo!
            – James Schinner
            Jan 6 at 15:54






          • 1




            Never mind. Just know your suggestion got me a 66% improvement when simulating 1000 simultaneous requests. (when no attributes were accessed). Needless to say i'm pretty happy with it. Benchmark code: gist.github.com/jamespeterschinner/…
            – James Schinner
            Jan 8 at 2:29

















          • Ok, fair point. I updated my post
            – James Schinner
            Jan 5 at 23:06










          • Edited. That code is pretty opaque, which seems to be characteristic of ORM code. :( But I think getattr might be your friend. qv.
            – Austin Hastings
            Jan 6 at 2:55










          • That's a great suggestion! Thanks. Mmm I started to re-evaluate that whole ORM code. I got it working once and forgot about it... Truly valuable feedback!
            – James Schinner
            Jan 6 at 3:02






          • 1




            After making your suggested changes, my test suite runs around 10 seconds faster than previously. From ~46 to ~35, not the best benchmark but a nice indication. WooHoo!
            – James Schinner
            Jan 6 at 15:54






          • 1




            Never mind. Just know your suggestion got me a 66% improvement when simulating 1000 simultaneous requests. (when no attributes were accessed). Needless to say i'm pretty happy with it. Benchmark code: gist.github.com/jamespeterschinner/…
            – James Schinner
            Jan 8 at 2:29
















          Ok, fair point. I updated my post
          – James Schinner
          Jan 5 at 23:06




          Ok, fair point. I updated my post
          – James Schinner
          Jan 5 at 23:06












          Edited. That code is pretty opaque, which seems to be characteristic of ORM code. :( But I think getattr might be your friend. qv.
          – Austin Hastings
          Jan 6 at 2:55




          Edited. That code is pretty opaque, which seems to be characteristic of ORM code. :( But I think getattr might be your friend. qv.
          – Austin Hastings
          Jan 6 at 2:55












          That's a great suggestion! Thanks. Mmm I started to re-evaluate that whole ORM code. I got it working once and forgot about it... Truly valuable feedback!
          – James Schinner
          Jan 6 at 3:02




          That's a great suggestion! Thanks. Mmm I started to re-evaluate that whole ORM code. I got it working once and forgot about it... Truly valuable feedback!
          – James Schinner
          Jan 6 at 3:02




          1




          1




          After making your suggested changes, my test suite runs around 10 seconds faster than previously. From ~46 to ~35, not the best benchmark but a nice indication. WooHoo!
          – James Schinner
          Jan 6 at 15:54




          After making your suggested changes, my test suite runs around 10 seconds faster than previously. From ~46 to ~35, not the best benchmark but a nice indication. WooHoo!
          – James Schinner
          Jan 6 at 15:54




          1




          1




          Never mind. Just know your suggestion got me a 66% improvement when simulating 1000 simultaneous requests. (when no attributes were accessed). Needless to say i'm pretty happy with it. Benchmark code: gist.github.com/jamespeterschinner/…
          – James Schinner
          Jan 8 at 2:29





          Never mind. Just know your suggestion got me a 66% improvement when simulating 1000 simultaneous requests. (when no attributes were accessed). Needless to say i'm pretty happy with it. Benchmark code: gist.github.com/jamespeterschinner/…
          – James Schinner
          Jan 8 at 2:29













           

          draft saved


          draft discarded


























           


          draft saved


          draft discarded














          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f184363%2fpython-jit-instantiation%23new-answer', 'question_page');

          );

          Post as a guest













































































          Popular posts from this blog

          Greedy Best First Search implementation in Rust

          Function to Return a JSON Like Objects Using VBA Collections and Arrays

          C++11 CLH Lock Implementation