Parse YAML file with nested parameters as a Python class object

I would like to use a YAML file to store parameters used by computational models developed in Python. An example of such a file is below:



params.yaml



reactor:
    diameter_inner: 2.89 cm
    temperature: 773 kelvin
    gas_mass_flow: 1.89 kg/s

biomass:
    diameter: 2.5 mm # mean Sauter diameter (1)
    density: 540 kg/m^3 # source unknown
    sphericity: 0.89 unitless # assumed value
    thermal_conductivity: 1.4 W/mK # based on value for pine (2)

catalyst:
    density: 1200 kg/m^3 # from MSDS sheet
    sphericity: 0.65 unitless # assumed value
    diameters: [[86.1, 124, 159.03, 201], microns] # sieve screen diameters
    surface_areas:
        values:
            - 12.9
            - 15
            - 18
            - 24.01
            - 31.8
            - 38.51
            - 42.6
        units: square micron


Parameters for the Python model are organized based on the type of computations they apply to. For example, parameters used by the reactor model are listed in the reactor section. Units are important for the calculations so the YAML file needs to convey that information too.



I'm using the PyYAML package to read the YAML file into a Python dictionary. To allow easier access to the nested parameters, I use an intermediate Python class to parse the dictionary values into class attributes. The class attributes are then used to obtain the values associated with the parameters. Below is an example of how I envision using the approach for a much larger project:



params.py



import yaml


class Reactor:

    def __init__(self, rdict):
        self.diameter_inner = float(rdict['diameter_inner'].split()[0])
        self.temperature = float(rdict['temperature'].split()[0])
        self.gas_mass_flow = float(rdict['gas_mass_flow'].split()[0])


class Biomass:

    def __init__(self, bdict):
        self.diameter = float(bdict['diameter'].split()[0])
        self.density = float(bdict['density'].split()[0])
        self.sphericity = float(bdict['sphericity'].split()[0])


class Catalyst:

    def __init__(self, cdict):
        self.diameters = cdict['diameters'][0]
        self.density = float(cdict['density'].split()[0])
        self.sphericity = float(cdict['sphericity'].split()[0])
        self.surface_areas = cdict['surface_areas']['values']


class Parameters:

    def __init__(self, file):

        with open(file, 'r') as f:
            params = yaml.safe_load(f)

        # reactor parameters
        rdict = params['reactor']
        self.reactor = Reactor(rdict)

        # biomass parameters
        bdict = params['biomass']
        self.biomass = Biomass(bdict)

        # catalyst parameters
        cdict = params['catalyst']
        self.catalyst = Catalyst(cdict)


example.py



from params import Parameters

pm = Parameters('params.yaml')

# reactor
d_inner = pm.reactor.diameter_inner
temp = pm.reactor.temperature
mf_gas = pm.reactor.gas_mass_flow

# biomass
d_bio = pm.biomass.diameter
rho_bio = pm.biomass.density

# catalyst
rho_cat = pm.catalyst.density
sp_cat = pm.catalyst.sphericity
d_cat = pm.catalyst.diameters
sa_cat = pm.catalyst.surface_areas

print('\n--- Reactor Parameters ---')
print(f'd_inner = {d_inner}')
print(f'temp = {temp}')
print(f'mf_gas = {mf_gas}')

print('\n--- Biomass Parameters ---')
print(f'd_bio = {d_bio}')
print(f'rho_bio = {rho_bio}')

print('\n--- Catalyst Parameters ---')
print(f'rho_cat = {rho_cat}')
print(f'sp_cat = {sp_cat}')
print(f'd_cat = {d_cat}')
print(f'sa_cat = {sa_cat}')


This approach works fine but when more parameters are added to the YAML file it requires additional code to be added to the class objects. I could just use the dictionary returned from the YAML package but I find it easier and cleaner to get the parameter values with a class interface.



So I would like to know if there is a better approach that I should use to parse the YAML file? Or should I organize the YAML file with a different structure to more easily parse it?







      asked Apr 17 at 1:00









      wigging

          2 Answers

















          up vote
          1
          down vote



          accepted










          You could use a nested parser, with pint doing the unit parsing:



          import yaml
          from pint import UnitRegistry, UndefinedUnitError

          UNITS = UnitRegistry()


          def nested_parser(params: dict):
              for key, value in params.items():
                  if isinstance(value, str):
                      # e.g. '2.89 cm'; keep the plain string if pint rejects the unit
                      try:
                          value = UNITS.Quantity(value)
                      except UndefinedUnitError:
                          pass
                      yield key, value
                  elif isinstance(value, dict):
                      if value.keys() == {'values', 'units'}:
                          # {'values': [...], 'units': '...'} becomes a list of quantities
                          yield key, [i * UNITS(value['units']) for i in value['values']]
                      else:
                          # any other dict is a nested section
                          yield key, dict(nested_parser(value))
                  elif isinstance(value, list):
                      # [[...], 'unit'] becomes a list of quantities
                      values, unit = value
                      yield key, [i * UNITS(unit) for i in values]


          dict(nested_parser(yaml.safe_load(params)))



          {'reactor': {'diameter_inner': <Quantity(2.89, 'centimeter')>,
            'temperature': <Quantity(773, 'kelvin')>,
            'gas_mass_flow': <Quantity(1.89, 'kilogram / second')>},
           'biomass': {'diameter': <Quantity(2.5, 'millimeter')>,
            'density': <Quantity(540.0, 'kilogram / meter ** 3')>,
            'sphericity': <Quantity(0.89, 'dimensionless')>,
            'thermal_conductivity': <Quantity(1.4, 'watt / millikelvin')>},
           'catalyst': {'density': <Quantity(1200.0, 'kilogram / meter ** 3')>,
            'sphericity': <Quantity(0.65, 'dimensionless')>,
            'diameters': [<Quantity(86.1, 'micrometer')>,
             <Quantity(124, 'micrometer')>,
             <Quantity(159.03, 'micrometer')>,
             <Quantity(201, 'micrometer')>],
            'surface_areas': [<Quantity(12.9, 'micrometer ** 2')>,
             <Quantity(15, 'micrometer ** 2')>,
             <Quantity(18, 'micrometer ** 2')>,
             <Quantity(24.01, 'micrometer ** 2')>,
             <Quantity(31.8, 'micrometer ** 2')>,
             <Quantity(38.51, 'micrometer ** 2')>,
             <Quantity(42.6, 'micrometer ** 2')>]}}



          You might need to make your units understandable to pint. For me that just meant changing microns to µm, square micron to µm², and unitless to dimensionless.
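If editing the YAML file is not desirable, the same renaming could in principle be done in code before the strings reach pint. The following is only a sketch under that assumption: the `UNIT_ALIASES` table and both helper functions are hypothetical names, and the table contains just the three substitutions mentioned above.

```python
# Hypothetical alias table: unit spellings used in params.yaml mapped to
# the spellings the answer reports switching to.
UNIT_ALIASES = {
    'microns': 'µm',
    'square micron': 'µm²',
    'unitless': 'dimensionless',
}


def normalize_unit(unit: str) -> str:
    """Return the aliased spelling of a bare unit, or the unit unchanged."""
    return UNIT_ALIASES.get(unit, unit)


def normalize_quantity(text: str) -> str:
    """Rewrite the unit part of a 'magnitude unit' string such as '0.89 unitless'."""
    magnitude, _, unit = text.partition(' ')
    return f'{magnitude} {normalize_unit(unit.strip())}'


print(normalize_quantity('0.89 unitless'))  # 0.89 dimensionless
print(normalize_quantity('2.89 cm'))        # 2.89 cm
print(normalize_unit('square micron'))      # µm²
```

Strings would be passed through `normalize_quantity` (or `normalize_unit` for the list and dict forms) before being handed to the unit registry.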



          using this



          statically



          configuration = dict(nested_parser(yaml.safe_load(params)))

          # reactor
          reactor_config = configuration['reactor']
          d_inner = reactor_config['diameter_inner']
          temp = reactor_config['temperature']
          mf_gas = reactor_config['gas_mass_flow']

          print('\n--- Reactor Parameters ---')
          print(f'd_inner = {d_inner}')
          print(f'temp = {temp}')
          print(f'mf_gas = {mf_gas}')


          dynamically



          for part, parameters in nested_parser(yaml.safe_load(params)):
              print(f'--- {part} Parameters ---')
              for parameter, value in parameters.items():
                  print(f'{parameter} = {value}')
              print('\n')


          You can check the pint documentation on string formatting to format the units the way you want.






          • My next step is to incorporate Pint, so thank you for the example. Can you also comment on how to use your approach in a Python script? In my example I use the class objects in params.py to read the YAML dictionary and assign the values to attributes. Then I refer to those classes in the example.py script. Would this approach work with pint? Or is there a different approach I should use?
            – wigging
            Apr 17 at 16:29










          • pint works with this approach. The values of the attributes are now instances of pint.Quantity, so the handling of string methods and so on will change, but fundamentally these quantities are no different from floats and ints. You can rework your classes to accept the dict of parameters and use setattr to set the attributes dynamically. Note that using a class only to hold the parameter values is a bit overkill; a dict will suffice for that purpose.
            – Maarten Fabré
            Apr 18 at 7:17
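A minimal sketch of the setattr idea from the comment above, assuming the parsed configuration is a nested dict. The class name `Section` and the sample values are hypothetical, and plain floats stand in for pint quantities so the snippet runs without pint installed.

```python
class Section:
    """Holds one section of the configuration as attributes."""

    def __init__(self, values: dict):
        for name, value in values.items():
            # Nested dicts become nested Section objects.
            if isinstance(value, dict):
                value = Section(value)
            setattr(self, name, value)

    def __repr__(self):
        return f'Section({vars(self)!r})'


# Stand-in for the dict produced by the parser (assumed shape).
config = {
    'reactor': {'diameter_inner': 2.89, 'temperature': 773.0},
    'biomass': {'density': 540.0, 'sphericity': 0.89},
}

pm = Section(config)
print(pm.reactor.temperature)  # 773.0
print(pm.biomass.density)      # 540.0
```

With this shape, adding a parameter to the YAML file needs no change to the class, as long as every key is a valid Python identifier.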











          • I agree that using a class is overkill. Can you provide an example of how to get the values from the dictionary? I'm thinking that a function like get_value('density') would work, but how would I define which density?
            – wigging
            Apr 18 at 12:14
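One possible way to say which density is meant is to pass the section name along with the parameter name. The sketch below assumes a dotted path such as 'biomass.density'; the two-argument signature of `get_value` and the sample values are illustrative choices, not something specified in the thread.

```python
from functools import reduce


def get_value(config: dict, path: str):
    """Look up a nested value using a dotted path such as 'biomass.density'."""
    return reduce(lambda node, key: node[key], path.split('.'), config)


# Stand-in for the parsed configuration (assumed shape, plain floats).
config = {
    'biomass': {'density': 540.0},
    'catalyst': {'density': 1200.0},
}

print(get_value(config, 'biomass.density'))   # 540.0
print(get_value(config, 'catalyst.density'))  # 1200.0
```

A missing section or parameter raises KeyError, which is usually preferable to silently returning a default for a model input.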










          • One more question. In yaml.safe_load(params) what is params? Is it a string representing the path to the yaml file?
            – wigging
            Apr 19 at 0:40










          • nested_parser takes any dict structured like the one in the YAML file. params is what gets passed to yaml.safe_load, so it can be the open file or a YAML string.
            – Maarten Fabré
            Apr 19 at 19:50

















          up vote
          1
          down vote













          1. If you split the configuration fields into magnitude and unit (as you've already done for surface_areas) you won't have to split and parse them in code.

          2. If you then convert your configuration to JSON you won't need to convert strings to numbers. JSON strings must be quoted, and numbers must be unquoted, so the json module will simply do those conversions for you.
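To illustrate the two points above, here is a sketch of a configuration in which each field is split into a magnitude and a unit and stored as JSON. The key names "value" and "unit" and the `CONFIG_JSON` string are assumptions made for the example.

```python
import json

# Hypothetical JSON version of part of the configuration, with every
# field split into a magnitude and a unit.
CONFIG_JSON = '''
{
  "reactor": {
    "diameter_inner": {"value": 2.89, "unit": "cm"},
    "temperature": {"value": 773, "unit": "kelvin"}
  },
  "catalyst": {
    "diameters": {"value": [86.1, 124, 159.03, 201], "unit": "microns"}
  }
}
'''

config = json.loads(CONFIG_JSON)

diameter = config['reactor']['diameter_inner']
# The magnitude arrives as a number, so no split() or float() is needed.
print(diameter['value'], diameter['unit'])  # 2.89 cm
print(type(config['reactor']['temperature']['value']).__name__)  # int
```

The same value/unit layout can also be written in YAML, where PyYAML likewise loads unquoted numbers as int or float.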

          Other than that:



          • Configuration handling should be separate from building other objects - that way it's easy to use your code whether the configuration comes from a file or from command-line parameters.

          • Accessing properties two levels deep (such as pm.biomass.diameter) violates the Law of Demeter. You could, for example, write an as_parameter_list method for each class to get a representation like f'rho_cat = {rho_cat}' etc.
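The two bullets above could be sketched as follows. This is one possible reading, not the answerer's own code: the factory function `catalyst_from_config`, the `prefix` argument, and the use of a dataclass are all assumptions made for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class Catalyst:
    density: float
    sphericity: float

    def as_parameter_list(self, prefix: str) -> list:
        """Return lines such as 'density_cat = 1200.0' for every field."""
        return [f'{f.name}_{prefix} = {getattr(self, f.name)}' for f in fields(self)]


def catalyst_from_config(config: dict) -> Catalyst:
    """Build a Catalyst from an already-loaded configuration dict."""
    section = config['catalyst']
    return Catalyst(density=section['density'], sphericity=section['sphericity'])


# The dict could equally come from a YAML file or from command-line arguments.
config = {'catalyst': {'density': 1200.0, 'sphericity': 0.65}}
catalyst = catalyst_from_config(config)
print('\n'.join(catalyst.as_parameter_list('cat')))
```

Here the Catalyst class knows nothing about files, and the calling script asks the object for its printable representation instead of reaching into its attributes.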





          • I'm not interested in using JSON for the parameters file because it does not support comments. I plan to use comments to add more information about certain parameters. I also feel that the YAML format is more readable than JSON. Can you provide an example of the configuration handling you mentioned?
            – wigging
            Apr 17 at 2:13










          • Nothing open source off the top of my head, but I bet any large project that allows configuration via either files or command-line arguments does this.
            – l0b0
            Apr 17 at 2:16











          Your Answer




          StackExchange.ifUsing("editor", function ()
          return StackExchange.using("mathjaxEditing", function ()
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
          );
          );
          , "mathjax-editing");

          StackExchange.ifUsing("editor", function ()
          StackExchange.using("externalEditor", function ()
          StackExchange.using("snippets", function ()
          StackExchange.snippets.init();
          );
          );
          , "code-snippets");

          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "196"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          convertImagesToLinks: false,
          noModals: false,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );








           

          draft saved


          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f192252%2fparse-yaml-file-with-nested-parameters-as-a-python-class-object%23new-answer', 'question_page');

          );

          Post as a guest






























          2 Answers
          2






          active

          oldest

          votes








          2 Answers
          2






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes








          up vote
          1
          down vote



          accepted










          you could use a nested parser using pint to do the unit parsing



          from pint import UnitRegistry, UndefinedUnitError
          UNITS = UnitRegistry()
          def nested_parser(params: dict):
          for key, value in params.items():
          if isinstance(value, str):
          try:
          value = units.Quantity(value)
          except UndefinedUnitError:
          pass
          yield key, value
          if isinstance(value, dict):
          if value.keys() == 'values', 'units':
          yield key, [i * UNITS(value['units']) for i in value['values']]
          else:
          yield key, dict(nested_parser(value))
          if isinstance(value, list):
          values, unit = value

          yield key, [i * UNITS(unit) for i in values]

          dict(nested_parser(yaml.safe_load(params)))



          'reactor': 'diameter_inner': <Quantity(2.89, 'centimeter')>,
          'temperature': <Quantity(773, 'kelvin')>,
          'gas_mass_flow': <Quantity(1.89, 'kilogram / second')>,
          'biomass': 'diameter': <Quantity(2.5, 'millimeter')>,
          'density': <Quantity(540.0, 'kilogram / meter ** 3')>,
          'sphericity': <Quantity(0.89, 'dimensionless')>,
          'thermal_conductivity': <Quantity(1.4, 'watt / millikelvin')>,
          'catalyst': 'density': <Quantity(1200.0, 'kilogram / meter ** 3')>,
          'sphericity': <Quantity(0.65, 'dimensionless')>,
          'diameters': [<Quantity(86.1, 'micrometer')>,
          <Quantity(124, 'micrometer')>,
          <Quantity(159.03, 'micrometer')>,
          <Quantity(201, 'micrometer')>],
          'surface_areas': [<Quantity(12.9, 'micrometer ** 2')>,
          <Quantity(15, 'micrometer ** 2')>,
          <Quantity(18, 'micrometer ** 2')>,
          <Quantity(24.01, 'micrometer ** 2')>,
          <Quantity(31.8, 'micrometer ** 2')>,
          <Quantity(38.51, 'micrometer ** 2')>,
          <Quantity(42.6, 'micrometer ** 2')>]



          You might need to make your units understandable for pint, but for me that just meant changing the microns to µm and square micron to µm², and unitless to dimensionless



          using this



          statically



          configuration = dict(nested_parser(yaml.safe_load(params)))

          # reactor
          reactor_config = configuration['reactor']
          d_inner = reactor_config['diameter_inner']
          temp = reactor_config['temperature']
          mf_gas = reactor_config['gas_mass_flow']

          print('n--- Reactor Parameters ---')
          print(f'd_inner = d_inner')
          print(f'temp = temp')
          print(f'mf_gas = mf_gas')


          dynamically



          for part, parameters in nested_parser(yaml.safe_load(params)):
          print(f'--- part Parameters ---')
          for parameter, value in parameters.items():
          print(f'parameter = value')
          print('n')


          you can check out the pint documentation on string formatting to format the units the way you want






          share|improve this answer























          • My next step is to incorporate Pint so thank you for the example. Can you also comment on how to utilize the your approach in a Python script? In my example I use the class objects in params.py to read the YAML dictionary and assign the values to attributes. Then I refer to those classes in the example.py script. Would this approach work with pint? Or is there a different approach I should use?
            – wigging
            Apr 17 at 16:29










          • pint works with this approach. The value of the attributes are not instances of pint.Quantity, so the handling of string methods and so will change, but fundamentally these quantities are no different than floats and ints. You can reform your classes to accept the dict of parameters, and use setattr to set the attributes dynamically. Note that using a class to only hold the parameter values is a bit overkill, and a dict will suffice for that purpose
            – Maarten Fabré
            Apr 18 at 7:17











          • I agree that using a class is overkill. Can you provide an example of how to get the values from the dictionary? I’m thinking that a function like get_value(‘density’) would work but how would I define which density?
            – wigging
            Apr 18 at 12:14










          • One more question. In yaml.safe_load(params) what is params? Is it a string representing the path to the yaml file?
            – wigging
            Apr 19 at 0:40










          • nested_parser takes any dict in the as in the yaml file, so params can be the file or a yaml string
            – Maarten Fabré
            Apr 19 at 19:50














          up vote
          1
          down vote



          accepted










          you could use a nested parser using pint to do the unit parsing



          from pint import UnitRegistry, UndefinedUnitError
          UNITS = UnitRegistry()
          def nested_parser(params: dict):
          for key, value in params.items():
          if isinstance(value, str):
          try:
          value = units.Quantity(value)
          except UndefinedUnitError:
          pass
          yield key, value
          if isinstance(value, dict):
          if value.keys() == 'values', 'units':
          yield key, [i * UNITS(value['units']) for i in value['values']]
          else:
          yield key, dict(nested_parser(value))
          if isinstance(value, list):
          values, unit = value

          yield key, [i * UNITS(unit) for i in values]

          dict(nested_parser(yaml.safe_load(params)))



          {'reactor': {'diameter_inner': <Quantity(2.89, 'centimeter')>,
                       'temperature': <Quantity(773, 'kelvin')>,
                       'gas_mass_flow': <Quantity(1.89, 'kilogram / second')>},
           'biomass': {'diameter': <Quantity(2.5, 'millimeter')>,
                       'density': <Quantity(540.0, 'kilogram / meter ** 3')>,
                       'sphericity': <Quantity(0.89, 'dimensionless')>,
                       'thermal_conductivity': <Quantity(1.4, 'watt / millikelvin')>},
           'catalyst': {'density': <Quantity(1200.0, 'kilogram / meter ** 3')>,
                        'sphericity': <Quantity(0.65, 'dimensionless')>,
                        'diameters': [<Quantity(86.1, 'micrometer')>,
                                      <Quantity(124, 'micrometer')>,
                                      <Quantity(159.03, 'micrometer')>,
                                      <Quantity(201, 'micrometer')>],
                        'surface_areas': [<Quantity(12.9, 'micrometer ** 2')>,
                                          <Quantity(15, 'micrometer ** 2')>,
                                          <Quantity(18, 'micrometer ** 2')>,
                                          <Quantity(24.01, 'micrometer ** 2')>,
                                          <Quantity(31.8, 'micrometer ** 2')>,
                                          <Quantity(38.51, 'micrometer ** 2')>,
                                          <Quantity(42.6, 'micrometer ** 2')>]}}



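          You might need to make your units understandable for pint, but for me that just meant changing microns to µm, square micron to µm², and unitless to dimensionless. Note that pint reads W/mK as watt per millikelvin, as the output above shows; if watt per meter-kelvin is intended, write it as W/(m*K).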



          using this



          statically



          configuration = dict(nested_parser(yaml.safe_load(params)))

          # reactor
          reactor_config = configuration['reactor']
          d_inner = reactor_config['diameter_inner']
          temp = reactor_config['temperature']
          mf_gas = reactor_config['gas_mass_flow']

          print('\n--- Reactor Parameters ---')
          print(f'd_inner = {d_inner}')
          print(f'temp = {temp}')
          print(f'mf_gas = {mf_gas}')


          dynamically



          for part, parameters in nested_parser(yaml.safe_load(params)):
              print(f'--- {part} Parameters ---')
              for parameter, value in parameters.items():
                  print(f'{parameter} = {value}')
              print('\n')


          You can check out the pint documentation on string formatting to format the units the way you want.






          • My next step is to incorporate Pint, so thank you for the example. Can you also comment on how to utilize your approach in a Python script? In my example I use the class objects in params.py to read the YAML dictionary and assign the values to attributes. Then I refer to those classes in the example.py script. Would this approach work with pint? Or is there a different approach I should use?
            – wigging
            Apr 17 at 16:29










          • pint works with this approach. The values of the attributes are now instances of pint.Quantity, so the handling of string methods and so on will change, but fundamentally these quantities are no different from floats and ints. You can rework your classes to accept the dict of parameters and use setattr to set the attributes dynamically. Note that using a class only to hold the parameter values is a bit overkill; a dict will suffice for that purpose.
            – Maarten Fabré
            Apr 18 at 7:17











          • I agree that using a class is overkill. Can you provide an example of how to get the values from the dictionary? I'm thinking that a function like get_value('density') would work, but how would I define which density?
            – wigging
            Apr 18 at 12:14
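One way to say which density is meant is to address each value by its path through the nested dict. The sketch below assumes a dotted-path convention such as 'biomass.density'; that convention, the function body, and the sample `config` are illustrative and not part of the answer.

```python
from functools import reduce


def get_value(config: dict, path: str):
    """Look up a value by a dotted path such as 'biomass.density'."""
    # Walk one key at a time, starting from the top-level dict.
    return reduce(lambda section, key: section[key], path.split('.'), config)


# Hypothetical parsed configuration with two different densities.
config = {
    'biomass': {'density': 540.0},
    'catalyst': {'density': 1200.0},
}

print(get_value(config, 'biomass.density'))
print(get_value(config, 'catalyst.density'))
```

A missing section or key raises KeyError, which is usually preferable to silently returning a default for a physical parameter.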










          • One more question. In yaml.safe_load(params) what is params? Is it a string representing the path to the yaml file?
            – wigging
            Apr 19 at 0:40










          • nested_parser takes any dict structured like the one in the YAML file; params, which is passed to yaml.safe_load, can be the open file or a YAML string
            – Maarten Fabré
            Apr 19 at 19:50
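To make the point about `params` concrete, here is a small sketch of what PyYAML's `yaml.safe_load` accepts. The YAML text is a shortened copy of the question's file, and `io.StringIO` stands in for an opened `params.yaml` so that nothing has to exist on disk.

```python
import io

import yaml

yaml_text = """
reactor:
  temperature: 773 kelvin
  gas_mass_flow: 1.89 kg/s
"""

# A YAML string can be passed directly.
from_string = yaml.safe_load(yaml_text)

# So can an open file-like object; io.StringIO stands in for
# open('params.yaml') here so the sketch runs without a file on disk.
with io.StringIO(yaml_text) as stream:
    from_stream = yaml.safe_load(stream)

# A bare path is NOT opened: it is parsed as a YAML scalar.
from_path = yaml.safe_load('params.yaml')

print(from_string == from_stream)
print(from_path)
```

In a real script the usual pattern would be `with open('params.yaml') as f: config = yaml.safe_load(f)`, with the resulting dict then handed to `nested_parser`.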












          edited Apr 18 at 14:05

          answered Apr 17 at 9:40

          Maarten Fabré























          up vote
          1
          down vote













          1. If you split the configuration fields into magnitude and unit (as you've already done for surface_areas) you won't have to split and parse them in code.

          2. If you then convert your configuration to JSON you won't need to convert strings to numbers. JSON strings must be quoted, and numbers must be unquoted, so the json module will simply do those conversions for you.
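As an illustration of the two points above, the sketch below loads a hypothetical JSON version of part of the parameters file in which each value is split into a magnitude and a unit. The key names `magnitude` and `unit` are assumptions chosen for this example.

```python
import json

# Hypothetical JSON version of part of the parameters file, with each
# value split into a magnitude and a unit.
text = """
{
  "biomass": {
    "density": {"magnitude": 540, "unit": "kg/m^3"},
    "sphericity": {"magnitude": 0.89, "unit": "unitless"}
  }
}
"""

config = json.loads(text)
density = config['biomass']['density']

# Numbers arrive as int or float and units as str, so no split() or
# float() call is needed.
print(type(density['magnitude']).__name__, density['unit'])
```

The same split layout also works in YAML, where PyYAML's `safe_load` likewise returns unquoted numbers as int or float.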

          Other than that:



          • Configuration handling should be separate from building other objects - that way it's easy to use your code whether the configuration comes from a file or from command-line parameters.

          • Accessing properties two levels deep (such as pm.biomass.diameter) violates the Law of Demeter. You could, for example, write an as_parameter_list method for each class to get a representation like f'rho_cat = {rho_cat}' etc.
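Both bullet points can be sketched together. In the example below the `Biomass` constructor takes plain values, a separate function `biomass_from_config` (a hypothetical name) does the configuration handling, and `as_parameter_list` produces the 'name = value' strings so that callers do not reach two levels deep. It is one possible reading of the advice, not a definitive design.

```python
class Biomass:
    """Holds biomass parameters; knows nothing about where they came from."""

    def __init__(self, diameter: float, density: float):
        self.diameter = diameter
        self.density = density

    def as_parameter_list(self) -> list:
        """Return 'name = value' strings so callers need not reach inside."""
        return [f'{name} = {value}' for name, value in vars(self).items()]


def biomass_from_config(config: dict) -> Biomass:
    """Configuration handling lives here, apart from the model class."""
    section = config['biomass']
    return Biomass(diameter=section['diameter'], density=section['density'])


# The dict could equally come from a YAML file or from command-line arguments.
config = {'biomass': {'diameter': 2.5, 'density': 540.0}}
biomass = biomass_from_config(config)

for line in biomass.as_parameter_list():
    print(line)
```

Because `Biomass` only sees keyword arguments, it can be constructed directly in tests or from an argparse namespace without touching any file.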





          • I'm not interested in using JSON for the parameters file because it does not support comments. I plan to use comments to add more information about certain parameters. I also feel that the YAML format is more readable than JSON. Can you provide an example of the configuration handling you mentioned?
            – wigging
            Apr 17 at 2:13










          • Nothing open source off the top of my head, but I bet any large project that allows configuration either via files or via command-line arguments does this.
            – l0b0
            Apr 17 at 2:16















          edited Apr 17 at 1:57

          answered Apr 17 at 1:48

          l0b0











           
