Iterate over a list of list names as file names













I'm interrogating a web API using query URLs. Each query returns zero hits, one hit, or multiple hits. Each of those categories needs to go into a separate CSV file for manual review or later processing. (More on this project here and here).



The input data (from a 14K-line CSV, one line per artist) is full of holes. Only a name is always given, but it may be misspelled or in a form that the API does not recognise. Birth and death dates may or may not be known, and may come with a precision like 'not before May 1533, not after January 1534'. The data may also have exact dates in ISO format.



Using those three output CSVs, the user can go back to their source, try to refine their data, and run the script again to get a better match. Exactly one hit is what we are after: a persistent identifier for this specific artist.



In the code below, df is a Pandas dataframe that has all the information in a form that is easiest to interrogate the API with.



First, I try to get an exact match with best_q (exact match of the name string + any of the available input fields elsewhere in the record). If that yields zero hits, I try a slightly looser match with bracket_q (any of the words in the literal name string + any of the available input fields elsewhere in the record).



I output the dataframe as a separate CSV, and each list of zero hits, single hits, or multiple hits also as a separate CSV.



I'm seeking advice on two specific things.



  1. Is there a more Pythonic way of handling the lists? Right now, I think the code is readable enough, but I have one line to generate the lists, another to put them in a list of lists, and another to put their names in a list of list names.

  2. The second thing is the nested if..elif on zero hits for the first query. I know it ain't pretty, but it's still quite readable (to me), and I don't see how I could do it any other way. That is: I have to try best_q first, and only if it yields zero, try again with bracket_q.


I have omitted what goes before. It works, it's been reviewed, I'm happy with it.



A final note: I'm not very concerned about performance, because the API is the bottleneck. I am concerned about readability. Users may want to tweak the script, somewhere down the line.



    singles, multiples, zeroes = ([] for i in range(3))

    for row in df.itertuples():
        query = best_q(row)
        hits, uri = ask_rkd(query)
        if hits == 1:
            singles.append([row.priref, row.name, hits, uri])
        elif hits > 1:
            multiples.append([row.priref, row.name, hits])
        elif hits == 0:
            query = bracket_q(row)
            hits, uri = ask_rkd(query)
            if hits == 1:
                singles.append([row.priref, row.name, hits, uri])
            elif hits > 1:
                multiples.append([row.priref, row.name, hits])
            elif hits == 0:
                zeroes.append([row.priref, str(row.name)]) # PM: str!!


    lists = singles, multiples, zeroes
    listnames = ['singles','multiples','zeroes']

    for s, l in zip(listnames, lists):
        listfile = '{}_{}.csv'.format(input_fname, s)
        writelist(list=l, fname=listfile)

    outfile = fname + '_out' + ext
    df.to_csv(outfile, sep='|', encoding='utf-8-sig')














edited Aug 3 at 13:42 by Sam Onela

asked Aug 2 at 10:31 by RolfBly




















          2 Answers



                  accepted






1. You can simplify your if structure: you duplicate the code for hits == 1 and hits > 1. To avoid this, move the if hits == 0 code into a 'guard statement' that updates hits and uri before the classification.

2. You should create a class to make the code easier to use. A simple class with an internal list, a name, a selection and a size would allow you to significantly reduce the amount of code you have to write.

3. All the appends are the same, except that you slice the record to the size you'd like. You can do this in the list class made in 2.

4. The only thing your ifs change is which list you append to, so you can use a dictionary to reduce the amount of code needed for this. You'd need a 'default' list and dict.get.

5. You won't need zip if you make the list contain the name, leaving a basic for loop.

I don't really know what the rest of your functions are, so I'll leave it at this:



    class List:
        def __init__(self, name, selection, size):
            self._list = []
            self.name = name
            self.selection = selection
            self.size = size

        def add(self, value):
            # Keep only the first `size` fields of the record.
            self._list.append(value[:self.size])

    lists = [
        List('zeroes', 0, 2),
        List('singles', 1, 4),
        List('multiples', None, 3),
    ]
    # Map each hit count to its list; None marks the fallback for hits > 1.
    list_selections = {l.selection: l for l in lists}
    default = list_selections.pop(None)

    for row in df.itertuples():
        hits, uri = ask_rkd(best_q(row))
        if hits == 0:
            hits, uri = ask_rkd(bracket_q(row))

        list_ = list_selections.get(hits, default)
        list_.add([row.priref, str(row.name), hits, uri])

    for list_ in lists:
        listfile = '{}_{}.csv'.format(input_fname, list_.name)
        writelist(list=list_, fname=listfile)

    outfile = fname + '_out' + ext
    df.to_csv(outfile, sep='|', encoding='utf-8-sig')
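
The dictionary lookup with a default can be tried without the API. The following is only a sketch under assumptions: the `results` tuples are made-up stand-ins for what ask_rkd would return, and the class is renamed `Bucket` here purely for illustration.

```python
# Hypothetical stand-in data: (priref, name, hits, uri) instead of real API calls.
results = [
    (1, "Rembrandt", 1, "https://example.org/artist/1"),
    (2, "Jan", 57, None),
    (3, "Unknwon Mastr", 0, None),
]


class Bucket:
    """A named list that keeps only the first `size` fields of each record."""

    def __init__(self, name, selection, size):
        self.name = name
        self.selection = selection  # hit count this bucket matches; None = fallback
        self.size = size
        self.rows = []

    def add(self, record):
        self.rows.append(record[:self.size])


buckets = [
    Bucket("zeroes", 0, 2),
    Bucket("singles", 1, 4),
    Bucket("multiples", None, 3),
]
by_hits = {b.selection: b for b in buckets}
default = by_hits.pop(None)  # any hit count other than 0 or 1 lands here

for priref, name, hits, uri in results:
    by_hits.get(hits, default).add([priref, name, hits, uri])

for b in buckets:
    print(b.name, b.rows)
```

Adding a fourth category would then mean adding one line to `buckets`, with no change to the loop.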













edited Aug 3 at 14:32 by Malachi♦

answered Aug 3 at 8:03 by Peilonrayz



































As you already noticed, you have repeated code. You have variable names for your lists and also string definitions for the list names, which most probably should match. While this is no big problem for just 3 lists, it could get cumbersome when adding another list. A simple way to avoid such name-to-string matching edits is to hold the lists in a dict() and keep only the string definitions.

The second problem is having different iterables that must match in length and order to be zipped later on. Avoid this by holding tuples (or other containers) in a single iterable from the beginning. Key-value pairs in a dict() also provide this binding.

In your case I'd recommend using the strings as keys:



    # avoid named variables
    lists = {name: [] for name in ('singles', 'multiples', 'zeroes')}

    # access lists via name
    lists['singles'].append(0)

    # access via temporary
    l = lists['singles']
    l.append(0)

    # iterate for saving
    for s, l in lists.items():
        writelist(list=l, fname=s + '.csv')



                      EDIT:



                      The answer above applies to the first version of the code, where all the list initialisation was skipped. While it is all still valid, it can now be applied to the real code: concise and following the KISS principle. Names could be improved but are left as they are to outline the changes only.



                      lists = {name: [] for name in ('singles', 'multiples', 'zeros')}

                      for row in df.itertuples():
                          query = best_q(row)
                          hits, uri = ask_rkd(query)
                          if hits == 0:
                              # fall back to the looser query only when the exact one finds nothing
                              query = bracket_q(row)
                              hits, uri = ask_rkd(query)

                          if hits == 1:
                              lists['singles'].append([row.priref, row.name, hits, uri])
                          elif hits > 1:
                              lists['multiples'].append([row.priref, row.name, hits])
                          elif hits == 0:
                              lists['zeros'].append([row.priref, str(row.name)]) # PM: str!!

                      for s, l in lists.items():
                          listfile = '{}_{}.csv'.format(input_fname, s)
                          writelist(list=l, fname=listfile)

                      outfile = fname + '_out' + ext
                      df.to_csv(outfile, sep='|', encoding='utf-8-sig')
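The saving loop can be tried without the API or the dataframe. The sketch below is only an illustration: the body of `writelist` is a guess (the question's own implementation is not shown here), the `|` delimiter is assumed to mirror the `df.to_csv` call, and the rows and the `artists` file stem are placeholders.

```python
import csv
import os
import tempfile


def writelist(list, fname):
    """Hypothetical writer: one CSV row per entry.

    The parameter names follow the keyword call used in the answer,
    which is why the built-in name `list` is shadowed here.
    """
    with open(fname, 'w', newline='', encoding='utf-8') as f:
        csv.writer(f, delimiter='|').writerows(list)


# Placeholder rows standing in for real query results.
lists = {
    'singles': [[1, 'Artist A', 1, 'https://example.org/1']],
    'multiples': [[2, 'Artist B', 3]],
    'zeros': [],
}

# Write into a temporary directory so the sketch leaves no files behind
# in the working directory.
input_fname = os.path.join(tempfile.mkdtemp(), 'artists')

written = []
for name, rows in lists.items():
    listfile = '{}_{}.csv'.format(input_fname, name)
    writelist(list=rows, fname=listfile)
    written.append(listfile)

print([os.path.basename(path) for path in written])
```

Each key of the dict becomes part of one file name, so adding a fourth category needs no change to the saving loop.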













                          edited Aug 3 at 13:53

                          answered Aug 2 at 11:02 by stefan






















                               
