Reclassifying movies by theme

Is there a more efficient way to solve the following problem, assuming the data is large? My code works, but how can I improve it to make it more efficient? Any suggestions?



Data:



    movie_sub_themes = {
        'Epic': ['Ben Hur', 'Gone With the Wind', 'Lawrence of Arabia'],
        'Spy': ['James Bond', 'Salt', 'Mission: Impossible'],
        'Superhero': ['The Dark Knight Trilogy', 'Hancock, Superman'],
        'Gangster': ['Gangs of New York', 'City of God', 'Reservoir Dogs'],
        'Fairy Tale': ['Maleficent', 'Into the Woods', 'Jack the Giant Killer'],
        'Romantic': ['Casablanca', 'The English Patient', 'A Walk to Remember'],
        'Epic Fantasy': ['Lord of the Rings', 'Chronicles of Narnia', 'Beowulf']
    }

    movie_themes = {
        'Action': ['Epic', 'Spy', 'Superhero'],
        'Crime': ['Gangster'],
        'Fantasy': ['Fairy Tale', 'Epic Fantasy'],
        'Romance': ['Romantic']
    }

    themes_keys = movie_themes.keys()
    theme_movies_keys = movie_sub_themes.keys()

    # Iterate in movie_themes
    # Check movie_themes keys in movie_sub_keys
    # if yes append the movie_sub_keys into the newdict
    newdict = {}
    for i in range(len(themes_keys)):
        a = []
        for j in range(len(movie_themes[themes_keys[i]])):
            try:
                if movie_themes[themes_keys[i]][j] in theme_movies_keys:
                    a.append(movie_sub_themes[movie_themes[themes_keys[i]][j]])
            except:
                pass
        newdict[themes_keys[i]] = a

    # newdict contains nested lists
    # Program to unpack the nested list into single list
    # Storing the value into theme_movies_data
    theme_movies_data = {}
    for k, v in newdict.iteritems():
        mylist_n = [j for i in v for j in i]
        theme_movies_data[k] = dict.fromkeys(mylist_n).keys()

    print (theme_movies_data)


Output:



    {'Action': ['Gone With the Wind', 'Ben Hur', 'Hancock, Superman', 'Mission: Impossible', 'James Bond', 'Lawrence of Arabia', 'Salt', 'The Dark Knight Trilogy'],
     'Crime': ['City of God', 'Reservoir Dogs', 'Gangs of New York'],
     'Fantasy': ['Jack the Giant Killer', 'Beowulf', 'Into the Woods', 'Maleficent', 'Lord of the Rings', 'Chronicles of Narnia'],
     'Romance': ['The English Patient', 'A Walk to Remember', 'Casablanca']}


Apologies for not properly commenting the code.



I am mainly concerned about the running time.

















edited May 22 at 17:59 by 200_success
asked May 22 at 15:21 by Ajay Shewale







  • Welcome to Code Review. Your title should explain what your code does on this site. codereview.stackexchange.com/help/how-to-ask
    – chicks, May 22 at 15:27

  • "Any efficient way to solve the following problem"… Which problem?
    – Mathias Ettinger, May 22 at 15:38






















2 Answers







3 votes (accepted)







  1. theme_movies_data and newdict are bad variable names; change them to names that are easier to read. This will reduce the number of comments you need in your code.

  2. You can simplify your code if you stop using range and use dict.iteritems more.

  3. You shouldn't need your try. This becomes obvious once you use range less.

  4. You don't need dict.fromkeys(mylist_n).keys(). All it does is drop duplicate titles, so it is redundant unless a movie can appear under more than one sub-theme.

    new_dict = {}
    for key, themes in movie_themes.items():
        a = []
        for theme in themes:
            if theme in movie_sub_themes:
                a.append(movie_sub_themes[theme])
        new_dict[key] = a


    theme_movies = {}
    for key, movie_groups in new_dict.iteritems():
        theme_movies[key] = [
            movie
            for movies in movie_groups
            for movie in movies
        ]

    print(theme_movies)
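If duplicate titles can occur, for example because one movie is filed under two sub-themes, the deduplication step may still be wanted. The following is a minimal sketch, assuming Python 3.7 or later (where dicts keep insertion order) rather than the Python 2 used in the question. The helper name `unique_in_order` and the sample list are made up for illustration.

```python
def unique_in_order(titles):
    """Return the titles with duplicates removed, keeping first-seen order."""
    # dict keys are unique, and dicts preserve insertion order in Python 3.7+.
    return list(dict.fromkeys(titles))


# Hypothetical input: 'Salt' is listed twice on purpose.
titles = ['James Bond', 'Salt', 'Mission: Impossible', 'Salt']
print(unique_in_order(titles))
```

In Python 2, plain dicts do not keep insertion order, which is why the output in the question appears shuffled.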


  1. You can remove the need for the second loop if you use a.extend.

  2. You can change the creation of a to a comprehension.

  3. You can change the creation of theme_movies to a dictionary comprehension.

    theme_movies = {
        key: sum(
            (movie_sub_themes.get(theme, []) for theme in themes),
            []
        )
        for key, themes in movie_themes.iteritems()
    }

    print(theme_movies)


Alternatively, if you don't like sum:

    theme_movies = {
        key: [
            movie
            for theme in themes
            for movie in movie_sub_themes.get(theme, [])
        ]
        for key, themes in movie_themes.iteritems()
    }

    print(theme_movies)
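The a.extend step mentioned in the list above is not shown as code in this answer. A minimal sketch of what it might look like follows, written for Python 3 (so `items()` replaces `iteritems()`). The function name `group_movies_by_theme`, the trimmed data, and the `'Unknown'` sub-theme are assumptions made for illustration only.

```python
def group_movies_by_theme(movie_themes, movie_sub_themes):
    """Map each theme to one flat list of the movies in its sub-themes."""
    theme_movies = {}
    for theme, sub_themes in movie_themes.items():
        movies = []
        for sub_theme in sub_themes:
            # extend() adds the titles directly, so no nested lists are
            # built and no second flattening loop is needed.
            movies.extend(movie_sub_themes.get(sub_theme, []))
        theme_movies[theme] = movies
    return theme_movies


# Trimmed sample data; 'Unknown' is a hypothetical sub-theme with no movies.
movie_sub_themes = {
    'Epic': ['Ben Hur'],
    'Spy': ['Salt', 'James Bond'],
    'Gangster': ['City of God'],
}
movie_themes = {
    'Action': ['Epic', 'Spy', 'Unknown'],
    'Crime': ['Gangster'],
}

print(group_movies_by_theme(movie_themes, movie_sub_themes))
```

Using `get` with an empty list as the default plays the same role as the `if theme in movie_sub_themes` check: a missing sub-theme simply contributes nothing.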
















answered May 22 at 15:42 by Peilonrayz
































1 vote









Here's my solution (using defaultdict):

    movie_sub_themes = {
        'Epic': ['Ben Hur', 'Gone With the Wind', 'Lawrence of Arabia'],
        'Spy': ['James Bond', 'Salt', 'Mission: Impossible'],
        'Superhero': ['The Dark Knight Trilogy', 'Hancock, Superman'],
        'Gangster': ['Gangs of New York', 'City of God', 'Reservoir Dogs'],
        'Fairy Tale': ['Maleficent', 'Into the Woods', 'Jack the Giant Killer'],
        'Romantic': ['Casablanca', 'The English Patient', 'A Walk to Remember'],
        'Epic Fantasy': ['Lord of the Rings', 'Chronicles of Narnia', 'Beowulf']
    }

    movie_themes = {
        'Action': ['Epic', 'Spy', 'Superhero'],
        'Crime': ['Gangster'],
        'Fantasy': ['Fairy Tale', 'Epic Fantasy'],
        'Romance': ['Romantic']
    }

    from collections import defaultdict
    newdict = defaultdict(list)

    for theme, sub_themes_list in movie_themes.items():
        for sub_theme in sub_themes_list:
            newdict[theme] += movie_sub_themes.get(sub_theme, [])

    dict(newdict)

    >> {'Action': ['Ben Hur',
      'Gone With the Wind',
      'Lawrence of Arabia',
      'James Bond',
      'Salt',
      'Mission: Impossible',
      'The Dark Knight Trilogy',
      'Hancock, Superman'],
     'Crime': ['Gangs of New York', 'City of God', 'Reservoir Dogs'],
     'Fantasy': ['Maleficent',
      'Into the Woods',
      'Jack the Giant Killer',
      'Lord of the Rings',
      'Chronicles of Narnia',
      'Beowulf'],
     'Romance': ['Casablanca', 'The English Patient', 'A Walk to Remember']}


timings: 4.84 µs vs 14.6 µs
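The answer does not say how these timings were measured or which figure belongs to which version. One way such a comparison could be made is with the standard-library `timeit` module. The sketch below assumes Python 3, uses trimmed data, and the function names `with_defaultdict` and `with_comprehension` are invented for this example. Numbers will vary by machine and interpreter.

```python
import timeit
from collections import defaultdict

# Trimmed sample data taken from the question.
movie_sub_themes = {
    'Epic': ['Ben Hur', 'Gone With the Wind', 'Lawrence of Arabia'],
    'Spy': ['James Bond', 'Salt', 'Mission: Impossible'],
    'Gangster': ['Gangs of New York', 'City of God', 'Reservoir Dogs'],
}
movie_themes = {
    'Action': ['Epic', 'Spy'],
    'Crime': ['Gangster'],
}


def with_defaultdict():
    """Group movies by theme using defaultdict, as in this answer."""
    grouped = defaultdict(list)
    for theme, sub_themes in movie_themes.items():
        for sub_theme in sub_themes:
            grouped[theme] += movie_sub_themes.get(sub_theme, [])
    return dict(grouped)


def with_comprehension():
    """Group movies by theme using a nested dict/list comprehension."""
    return {
        theme: [
            movie
            for sub_theme in sub_themes
            for movie in movie_sub_themes.get(sub_theme, [])
        ]
        for theme, sub_themes in movie_themes.items()
    }


runs = 1000
for func in (with_defaultdict, with_comprehension):
    # timeit.timeit accepts a callable and returns total seconds for all runs.
    seconds = timeit.timeit(func, number=runs)
    print("%s: %.2f us per call" % (func.__name__, seconds / runs * 1e6))
```

On data this small the differences are tiny; for a question about large data it would be more informative to time the candidates on inputs of realistic size.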














edited May 22 at 17:15
answered May 22 at 16:49 by Highland Mark







  • Since it's Python 2.7, consider using iteritems instead of items. Also the immutable data at the top should be using tuples instead of lists.
    – Reinderien, May 22 at 18:41

  • Also, in your inner loop, avoid writing newdict[theme]. You should cache the result of this index lookup in the level above.
    – Reinderien, May 22 at 18:43
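The two comments above can be read as the following sketch. It is one possible interpretation, not the commenter's own code: it assumes Python 3 (so `items()` is used where Python 2.7 would use `iteritems()`), stores the fixed input data in tuples, and looks up the per-theme list once in the outer loop. The function name `group_movies` and the trimmed data are illustrative.

```python
def group_movies(movie_themes, movie_sub_themes):
    """Map each theme to a flat list of movies from its sub-themes."""
    grouped = {}
    for theme, sub_themes in movie_themes.items():
        # Look up (and create) the target list once per theme instead of
        # indexing grouped[theme] on every pass of the inner loop.
        movies = grouped.setdefault(theme, [])
        for sub_theme in sub_themes:
            # extend() accepts any iterable, so tuple values work here.
            movies.extend(movie_sub_themes.get(sub_theme, ()))
    return grouped


# Input data that is never modified, stored as tuples.
movie_sub_themes = {
    'Epic': ('Ben Hur', 'Lawrence of Arabia'),
    'Spy': ('Salt',),
    'Gangster': ('City of God',),
}
movie_themes = {
    'Action': ('Epic', 'Spy'),
    'Crime': ('Gangster',),
}

print(group_movies(movie_themes, movie_sub_themes))
```

Whether caching the lookup gives a measurable speed-up depends on the data; it would need to be timed rather than assumed.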











