Flatten Dictionary Python challenge

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
1
down vote

favorite
1












I have a question about this coding challenge for "Flatten a Dictionary":




Given a dictionary dict, write a function flattenDictionary that
returns a flattened version of it .



If you’re using a compiled language such Java, C++, C#, Swift and Go,
you may want to use a Map/Dictionary/Hash Table that maps strings
(keys) to a generic type (e.g. Object in Java, AnyObject in Swift
etc.) to allow nested dictionaries.



Example:



Input:



dict = 
"Key1" : "1",
"Key2" :
"a" : "2",
"b" : "3",
"c" :
"d" : "3",
"e" : "1"





Output:




"Key1" : "1",
"Key2.a" : "2",
"Key2.b" : "3",
"Key2.c.d" : "3",
"Key2.c.e" : "1"



Important: when you concatenate keys, make sure to add the dot character between them. For instance concatenating Key2, c and d the result key would be Key2.c.d.




def flatten_dictionary(dictionary):

def items():
# loop through each item inside the dictionary k, v
#Appending
# check if the sub-key and sub-value are
# inside the flatten_dict(value)
# join on subkey array
# add to result
# clear out prev_keys
for key, value in dictionary.items():
if isinstance(value, dict):
for subkey, subvalue in flatten_dictionary(value).items():
if key == "":
yield subkey, subvalue
yield key + "." + subkey, subvalue

else:

yield key, value

return dict(items())


# test cases 1

dictionary2 =
"Key1" : "1",
"Key2" :
"a" : "2",
"b" : "3",
"c" :
"d" : "3",
"e" : "1"





# output:
# "Key1" : "1",
# "Key2.a" : "2",
# "Key2.b" : "3",
# "Key2.c.d" : "3",
# "Key2.c.e" : "1"
#

print(flatten_dictionary(dictionary2))






share|improve this question

















  • 4




    Flattening a dictionary is a weird request. How do you handle key collisions? Overwrites really don't make sense, because they are implementation dependent (at least before Python 3.4, I believe where dict behaves like OrderedDict), so you can't definitively say across python versions how it should behave. Your key + '.' + subkey also doesn't generalize, but keys are not always strings. Anything immutable that defines __hash__ and __eq__ can be a key.
    – Bailey Parker
    Jan 26 at 0:03






  • 1




    @BaileyParker Or the problem has to be properly described. The way it stands at the moment justifies all your concerns. This is what SO would classify as too broad. Imho always.
    – Ev. Kounis
    Jan 26 at 8:41







  • 3




    You should at least add a link to the problem description, where these ambiguities might be resolved already.
    – Graipher
    Jan 26 at 9:40










  • I updated the problem description so it is clear what the question is asking me to do.
    – NinjaG
    Jan 26 at 16:51










  • Your question currently leaves some things to be desired. I'd recommend taking a look at Simon's Guide to posting a good question. In particular, you could add a reason for why you are posting your code here, what do you want from a review? Also, how much have you tested this code?
    – Simon Forsberg♦
    Jan 26 at 16:55
















up vote
1
down vote

favorite
1












I have a question about this coding challenge for "Flatten a Dictionary":




Given a dictionary dict, write a function flattenDictionary that
returns a flattened version of it .



If you’re using a compiled language such Java, C++, C#, Swift and Go,
you may want to use a Map/Dictionary/Hash Table that maps strings
(keys) to a generic type (e.g. Object in Java, AnyObject in Swift
etc.) to allow nested dictionaries.



Example:



Input:



dict = 
"Key1" : "1",
"Key2" :
"a" : "2",
"b" : "3",
"c" :
"d" : "3",
"e" : "1"





Output:




"Key1" : "1",
"Key2.a" : "2",
"Key2.b" : "3",
"Key2.c.d" : "3",
"Key2.c.e" : "1"



Important: when you concatenate keys, make sure to add the dot character between them. For instance concatenating Key2, c and d the result key would be Key2.c.d.




def flatten_dictionary(dictionary):

def items():
# loop through each item inside the dictionary k, v
#Appending
# check if the sub-key and sub-value are
# inside the flatten_dict(value)
# join on subkey array
# add to result
# clear out prev_keys
for key, value in dictionary.items():
if isinstance(value, dict):
for subkey, subvalue in flatten_dictionary(value).items():
if key == "":
yield subkey, subvalue
yield key + "." + subkey, subvalue

else:

yield key, value

return dict(items())


# test cases 1

dictionary2 =
"Key1" : "1",
"Key2" :
"a" : "2",
"b" : "3",
"c" :
"d" : "3",
"e" : "1"





# output:
# "Key1" : "1",
# "Key2.a" : "2",
# "Key2.b" : "3",
# "Key2.c.d" : "3",
# "Key2.c.e" : "1"
#

print(flatten_dictionary(dictionary2))






share|improve this question

















  • 4




    Flattening a dictionary is a weird request. How do you handle key collisions? Overwrites really don't make sense, because they are implementation dependent (at least before Python 3.4, I believe where dict behaves like OrderedDict), so you can't definitively say across python versions how it should behave. Your key + '.' + subkey also doesn't generalize, but keys are not always strings. Anything immutable that defines __hash__ and __eq__ can be a key.
    – Bailey Parker
    Jan 26 at 0:03






  • 1




    @BaileyParker Or the problem has to be properly described. The way it stands at the moment justifies all your concerns. This is what SO would classify as too broad. Imho always.
    – Ev. Kounis
    Jan 26 at 8:41







  • 3




    You should at least add a link to the problem description, where these ambiguities might be resolved already.
    – Graipher
    Jan 26 at 9:40










  • I updated the problem description so it is clear what the question is asking me to do.
    – NinjaG
    Jan 26 at 16:51










  • Your question currently leaves some things to be desired. I'd recommend taking a look at Simon's Guide to posting a good question. In particular, you could add a reason for why you are posting your code here, what do you want from a review? Also, how much have you tested this code?
    – Simon Forsberg♦
    Jan 26 at 16:55












up vote
1
down vote

favorite
1









up vote
1
down vote

favorite
1






1





I have a question about this coding challenge for "Flatten a Dictionary":




Given a dictionary dict, write a function flattenDictionary that
returns a flattened version of it .



If you’re using a compiled language such Java, C++, C#, Swift and Go,
you may want to use a Map/Dictionary/Hash Table that maps strings
(keys) to a generic type (e.g. Object in Java, AnyObject in Swift
etc.) to allow nested dictionaries.



Example:



Input:



dict = 
"Key1" : "1",
"Key2" :
"a" : "2",
"b" : "3",
"c" :
"d" : "3",
"e" : "1"





Output:




"Key1" : "1",
"Key2.a" : "2",
"Key2.b" : "3",
"Key2.c.d" : "3",
"Key2.c.e" : "1"



Important: when you concatenate keys, make sure to add the dot character between them. For instance concatenating Key2, c and d the result key would be Key2.c.d.




def flatten_dictionary(dictionary):

def items():
# loop through each item inside the dictionary k, v
#Appending
# check if the sub-key and sub-value are
# inside the flatten_dict(value)
# join on subkey array
# add to result
# clear out prev_keys
for key, value in dictionary.items():
if isinstance(value, dict):
for subkey, subvalue in flatten_dictionary(value).items():
if key == "":
yield subkey, subvalue
yield key + "." + subkey, subvalue

else:

yield key, value

return dict(items())


# test cases 1

dictionary2 =
"Key1" : "1",
"Key2" :
"a" : "2",
"b" : "3",
"c" :
"d" : "3",
"e" : "1"





# output:
# "Key1" : "1",
# "Key2.a" : "2",
# "Key2.b" : "3",
# "Key2.c.d" : "3",
# "Key2.c.e" : "1"
#

print(flatten_dictionary(dictionary2))






share|improve this question













I have a question about this coding challenge for "Flatten a Dictionary":




Given a dictionary dict, write a function flattenDictionary that
returns a flattened version of it .



If you’re using a compiled language such Java, C++, C#, Swift and Go,
you may want to use a Map/Dictionary/Hash Table that maps strings
(keys) to a generic type (e.g. Object in Java, AnyObject in Swift
etc.) to allow nested dictionaries.



Example:



Input:



dict = 
"Key1" : "1",
"Key2" :
"a" : "2",
"b" : "3",
"c" :
"d" : "3",
"e" : "1"





Output:




"Key1" : "1",
"Key2.a" : "2",
"Key2.b" : "3",
"Key2.c.d" : "3",
"Key2.c.e" : "1"



Important: when you concatenate keys, make sure to add the dot character between them. For instance concatenating Key2, c and d the result key would be Key2.c.d.




def flatten_dictionary(dictionary):

def items():
# loop through each item inside the dictionary k, v
#Appending
# check if the sub-key and sub-value are
# inside the flatten_dict(value)
# join on subkey array
# add to result
# clear out prev_keys
for key, value in dictionary.items():
if isinstance(value, dict):
for subkey, subvalue in flatten_dictionary(value).items():
if key == "":
yield subkey, subvalue
yield key + "." + subkey, subvalue

else:

yield key, value

return dict(items())


# test cases 1

dictionary2 =
"Key1" : "1",
"Key2" :
"a" : "2",
"b" : "3",
"c" :
"d" : "3",
"e" : "1"





# output:
# "Key1" : "1",
# "Key2.a" : "2",
# "Key2.b" : "3",
# "Key2.c.d" : "3",
# "Key2.c.e" : "1"
#

print(flatten_dictionary(dictionary2))








share|improve this question












share|improve this question




share|improve this question








edited Jan 26 at 23:30









Jamal♦

30.1k11114225




30.1k11114225









asked Jan 25 at 22:52









NinjaG

756221




756221







  • 4




    Flattening a dictionary is a weird request. How do you handle key collisions? Overwrites really don't make sense, because they are implementation dependent (at least before Python 3.4, I believe where dict behaves like OrderedDict), so you can't definitively say across python versions how it should behave. Your key + '.' + subkey also doesn't generalize, but keys are not always strings. Anything immutable that defines __hash__ and __eq__ can be a key.
    – Bailey Parker
    Jan 26 at 0:03






  • 1




    @BaileyParker Or the problem has to be properly described. The way it stands at the moment justifies all your concerns. This is what SO would classify as too broad. Imho always.
    – Ev. Kounis
    Jan 26 at 8:41







  • 3




    You should at least add a link to the problem description, where these ambiguities might be resolved already.
    – Graipher
    Jan 26 at 9:40










  • I updated the problem description so it is clear what the question is asking me to do.
    – NinjaG
    Jan 26 at 16:51










  • Your question currently leaves some things to be desired. I'd recommend taking a look at Simon's Guide to posting a good question. In particular, you could add a reason for why you are posting your code here, what do you want from a review? Also, how much have you tested this code?
    – Simon Forsberg♦
    Jan 26 at 16:55












  • 4




    Flattening a dictionary is a weird request. How do you handle key collisions? Overwrites really don't make sense, because they are implementation dependent (at least before Python 3.4, I believe where dict behaves like OrderedDict), so you can't definitively say across python versions how it should behave. Your key + '.' + subkey also doesn't generalize, but keys are not always strings. Anything immutable that defines __hash__ and __eq__ can be a key.
    – Bailey Parker
    Jan 26 at 0:03






  • 1




    @BaileyParker Or the problem has to be properly described. The way it stands at the moment justifies all your concerns. This is what SO would classify as too broad. Imho always.
    – Ev. Kounis
    Jan 26 at 8:41







  • 3




    You should at least add a link to the problem description, where these ambiguities might be resolved already.
    – Graipher
    Jan 26 at 9:40










  • I updated the problem description so it is clear what the question is asking me to do.
    – NinjaG
    Jan 26 at 16:51










  • Your question currently leaves some things to be desired. I'd recommend taking a look at Simon's Guide to posting a good question. In particular, you could add a reason for why you are posting your code here, what do you want from a review? Also, how much have you tested this code?
    – Simon Forsberg♦
    Jan 26 at 16:55







4




4




Flattening a dictionary is a weird request. How do you handle key collisions? Overwrites really don't make sense, because they are implementation dependent (at least before Python 3.4, I believe where dict behaves like OrderedDict), so you can't definitively say across python versions how it should behave. Your key + '.' + subkey also doesn't generalize, but keys are not always strings. Anything immutable that defines __hash__ and __eq__ can be a key.
– Bailey Parker
Jan 26 at 0:03




Flattening a dictionary is a weird request. How do you handle key collisions? Overwrites really don't make sense, because they are implementation dependent (at least before Python 3.4, I believe where dict behaves like OrderedDict), so you can't definitively say across python versions how it should behave. Your key + '.' + subkey also doesn't generalize, but keys are not always strings. Anything immutable that defines __hash__ and __eq__ can be a key.
– Bailey Parker
Jan 26 at 0:03




1




1




@BaileyParker Or the problem has to be properly described. The way it stands at the moment justifies all your concerns. This is what SO would classify as too broad. Imho always.
– Ev. Kounis
Jan 26 at 8:41





@BaileyParker Or the problem has to be properly described. The way it stands at the moment justifies all your concerns. This is what SO would classify as too broad. Imho always.
– Ev. Kounis
Jan 26 at 8:41





3




3




You should at least add a link to the problem description, where these ambiguities might be resolved already.
– Graipher
Jan 26 at 9:40




You should at least add a link to the problem description, where these ambiguities might be resolved already.
– Graipher
Jan 26 at 9:40












I updated the problem description so it is clear what the question is asking me to do.
– NinjaG
Jan 26 at 16:51




I updated the problem description so it is clear what the question is asking me to do.
– NinjaG
Jan 26 at 16:51












Your question currently leaves some things to be desired. I'd recommend taking a look at Simon's Guide to posting a good question. In particular, you could add a reason for why you are posting your code here, what do you want from a review? Also, how much have you tested this code?
– Simon Forsberg♦
Jan 26 at 16:55




Your question currently leaves some things to be desired. I'd recommend taking a look at Simon's Guide to posting a good question. In particular, you could add a reason for why you are posting your code here, what do you want from a review? Also, how much have you tested this code?
– Simon Forsberg♦
Jan 26 at 16:55










1 Answer
1






active

oldest

votes

















up vote
2
down vote



accepted










1. Bugs




  1. The special case for empty keys:



    if key == "":
    yield subkey, subvalue
    yield key + "." + subkey, subvalue


    is missing an else: and so leads to items appearing twice:



    >>> flatten_dictionary("": "a":1)
    'a': 1, '.a': 1



  2. Even if we fix the bug by adding the else:, there's still a problem. Consider this example:



    >>> flatten_dictionary("a": "": "b": 1, "": "a": "b": 2)
    'a.b': 2


    What happened to the 1? Ignoring the empty string keys has led two different keys to be collapsed into one. It would be better to remove the special case for empty strings, so that:



    >>> flatten_dictionary("a": "": "b": 1, "": "a": "b": 2)
    'a..b': 1, '.a.b': 2



  3. Python has a limited stack, but dictionaries can be nested arbitrarily deeply:



    >>> d = 
    >>> for i in range(1000): d = 'a':d
    ...
    >>> flatten_dictionary(d)
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "cr186013.py", line 22, in flatten_dictionary
    return dict(items())
    [... many line omitted ...]
    File "cr186013.py", line 13, in items
    for subkey, subvalue in flatten_dictionary(value).items():
    RecursionError: maximum recursion depth exceeded


    This problem can be avoided using the stack of iterators pattern.



2. Other review points




  1. The comment does not match the code: there is nothing in the code corresponding to "check if the sub-key and sub-value are inside the flatten_dict(value)" or "join on subkey array" or "clear out prev_keys". Incorrect comments like this are worse than useless: they make it harder to understand and maintain the code.



    Did this comment describe an earlier version of the code, and then you changed the code but forgot to edit the comment? It is worth getting into the habit of changing the comment first so that you don't forget.




  2. The construction of the result keys has unnecessary quadratic runtime. For example, in this situation:



    >>> flatten_dictionary('a': 'a': 'a': 'a': 'a': 'a': 1)
    'a.a.a.a.a.a': 1


    there is only the need to concatenate one result key ('a.a.a.a.a.a'), but the code concatenates five result keys: not just the one that we need, but 'a.a', 'a.a.a', 'a.a.a.a' and 'a.a.a.a.a' as well. The way to avoid this is to keep a stack of dictionary keys currently being visited, and use str.join on the stack when you need to concatenate the result key.



3. Revised code



def flatten_dictionary(d):
result =
stack = [iter(d.items())]
keys =
while stack:
for k, v in stack[-1]:
keys.append(k)
if isinstance(v, dict):
stack.append(iter(v.items()))
break
else:
result['.'.join(keys)] = v
keys.pop()
else:
if keys:
keys.pop()
stack.pop()
return result





share|improve this answer





















  • I don't think it can pass this test cases: Input: "":"a":"1","b":"3" Expected: "a":"1","b":"3" Actual: '.a': '1', 'b': '3'
    – NinjaG
    Jan 28 at 4:38










  • See §1.2 for why special handling of empty strings is a bad idea.
    – Gareth Rees
    Jan 28 at 10:14











Your Answer




StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);








 

draft saved


draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f186013%2fflatten-dictionary-python-challenge%23new-answer', 'question_page');

);

Post as a guest






























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
2
down vote



accepted










1. Bugs




  1. The special case for empty keys:



    if key == "":
    yield subkey, subvalue
    yield key + "." + subkey, subvalue


    is missing an else: and so leads to items appearing twice:



    >>> flatten_dictionary("": "a":1)
    'a': 1, '.a': 1



  2. Even if we fix the bug by adding the else:, there's still a problem. Consider this example:



    >>> flatten_dictionary("a": "": "b": 1, "": "a": "b": 2)
    'a.b': 2


    What happened to the 1? Ignoring the empty string keys has led two different keys to be collapsed into one. It would be better to remove the special case for empty strings, so that:



    >>> flatten_dictionary("a": "": "b": 1, "": "a": "b": 2)
    'a..b': 1, '.a.b': 2



  3. Python has a limited stack, but dictionaries can be nested arbitrarily deeply:



    >>> d = 
    >>> for i in range(1000): d = 'a':d
    ...
    >>> flatten_dictionary(d)
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "cr186013.py", line 22, in flatten_dictionary
    return dict(items())
    [... many line omitted ...]
    File "cr186013.py", line 13, in items
    for subkey, subvalue in flatten_dictionary(value).items():
    RecursionError: maximum recursion depth exceeded


    This problem can be avoided using the stack of iterators pattern.



2. Other review points




  1. The comment does not match the code: there is nothing in the code corresponding to "check if the sub-key and sub-value are inside the flatten_dict(value)" or "join on subkey array" or "clear out prev_keys". Incorrect comments like this are worse than useless: they make it harder to understand and maintain the code.



    Did this comment describe an earlier version of the code, and then you changed the code but forgot to edit the comment? It is worth getting into the habit of changing the comment first so that you don't forget.




  2. The construction of the result keys has unnecessary quadratic runtime. For example, in this situation:



    >>> flatten_dictionary('a': 'a': 'a': 'a': 'a': 'a': 1)
    'a.a.a.a.a.a': 1


    there is only the need to concatenate one result key ('a.a.a.a.a.a'), but the code concatenates five result keys: not just the one that we need, but 'a.a', 'a.a.a', 'a.a.a.a' and 'a.a.a.a.a' as well. The way to avoid this is to keep a stack of dictionary keys currently being visited, and use str.join on the stack when you need to concatenate the result key.



3. Revised code



def flatten_dictionary(d):
result =
stack = [iter(d.items())]
keys =
while stack:
for k, v in stack[-1]:
keys.append(k)
if isinstance(v, dict):
stack.append(iter(v.items()))
break
else:
result['.'.join(keys)] = v
keys.pop()
else:
if keys:
keys.pop()
stack.pop()
return result





share|improve this answer





















  • I don't think it can pass this test cases: Input: "":"a":"1","b":"3" Expected: "a":"1","b":"3" Actual: '.a': '1', 'b': '3'
    – NinjaG
    Jan 28 at 4:38










  • See §1.2 for why special handling of empty strings is a bad idea.
    – Gareth Rees
    Jan 28 at 10:14















up vote
2
down vote



accepted










1. Bugs




  1. The special case for empty keys:



    if key == "":
    yield subkey, subvalue
    yield key + "." + subkey, subvalue


    is missing an else: and so leads to items appearing twice:



    >>> flatten_dictionary("": "a":1)
    'a': 1, '.a': 1



  2. Even if we fix the bug by adding the else:, there's still a problem. Consider this example:



    >>> flatten_dictionary("a": "": "b": 1, "": "a": "b": 2)
    'a.b': 2


    What happened to the 1? Ignoring the empty string keys has led two different keys to be collapsed into one. It would be better to remove the special case for empty strings, so that:



    >>> flatten_dictionary("a": "": "b": 1, "": "a": "b": 2)
    'a..b': 1, '.a.b': 2



  3. Python has a limited stack, but dictionaries can be nested arbitrarily deeply:



    >>> d = 
    >>> for i in range(1000): d = 'a':d
    ...
    >>> flatten_dictionary(d)
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "cr186013.py", line 22, in flatten_dictionary
    return dict(items())
    [... many line omitted ...]
    File "cr186013.py", line 13, in items
    for subkey, subvalue in flatten_dictionary(value).items():
    RecursionError: maximum recursion depth exceeded


    This problem can be avoided using the stack of iterators pattern.



2. Other review points




  1. The comment does not match the code: there is nothing in the code corresponding to "check if the sub-key and sub-value are inside the flatten_dict(value)" or "join on subkey array" or "clear out prev_keys". Incorrect comments like this are worse than useless: they make it harder to understand and maintain the code.



    Did this comment describe an earlier version of the code, and then you changed the code but forgot to edit the comment? It is worth getting into the habit of changing the comment first so that you don't forget.




  2. The construction of the result keys has unnecessary quadratic runtime. For example, in this situation:



    >>> flatten_dictionary('a': 'a': 'a': 'a': 'a': 'a': 1)
    'a.a.a.a.a.a': 1


    there is only the need to concatenate one result key ('a.a.a.a.a.a'), but the code concatenates five result keys: not just the one that we need, but 'a.a', 'a.a.a', 'a.a.a.a' and 'a.a.a.a.a' as well. The way to avoid this is to keep a stack of dictionary keys currently being visited, and use str.join on the stack when you need to concatenate the result key.



3. Revised code



def flatten_dictionary(d):
result =
stack = [iter(d.items())]
keys =
while stack:
for k, v in stack[-1]:
keys.append(k)
if isinstance(v, dict):
stack.append(iter(v.items()))
break
else:
result['.'.join(keys)] = v
keys.pop()
else:
if keys:
keys.pop()
stack.pop()
return result





share|improve this answer





















  • I don't think it can pass this test cases: Input: "":"a":"1","b":"3" Expected: "a":"1","b":"3" Actual: '.a': '1', 'b': '3'
    – NinjaG
    Jan 28 at 4:38










  • See §1.2 for why special handling of empty strings is a bad idea.
    – Gareth Rees
    Jan 28 at 10:14













up vote
2
down vote



accepted







up vote
2
down vote



accepted






1. Bugs




  1. The special case for empty keys:



    if key == "":
    yield subkey, subvalue
    yield key + "." + subkey, subvalue


    is missing an else: and so leads to items appearing twice:



    >>> flatten_dictionary("": "a":1)
    'a': 1, '.a': 1



  2. Even if we fix the bug by adding the else:, there's still a problem. Consider this example:



    >>> flatten_dictionary("a": "": "b": 1, "": "a": "b": 2)
    'a.b': 2


    What happened to the 1? Ignoring the empty string keys has led two different keys to be collapsed into one. It would be better to remove the special case for empty strings, so that:



    >>> flatten_dictionary("a": "": "b": 1, "": "a": "b": 2)
    'a..b': 1, '.a.b': 2



  3. Python has a limited stack, but dictionaries can be nested arbitrarily deeply:



    >>> d = 
    >>> for i in range(1000): d = 'a':d
    ...
    >>> flatten_dictionary(d)
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "cr186013.py", line 22, in flatten_dictionary
    return dict(items())
    [... many line omitted ...]
    File "cr186013.py", line 13, in items
    for subkey, subvalue in flatten_dictionary(value).items():
    RecursionError: maximum recursion depth exceeded


    This problem can be avoided using the stack of iterators pattern.



2. Other review points




  1. The comment does not match the code: there is nothing in the code corresponding to "check if the sub-key and sub-value are inside the flatten_dict(value)" or "join on subkey array" or "clear out prev_keys". Incorrect comments like this are worse than useless: they make it harder to understand and maintain the code.



    Did this comment describe an earlier version of the code, and then you changed the code but forgot to edit the comment? It is worth getting into the habit of changing the comment first so that you don't forget.




  2. The construction of the result keys has unnecessary quadratic runtime. For example, in this situation:



    >>> flatten_dictionary('a': 'a': 'a': 'a': 'a': 'a': 1)
    'a.a.a.a.a.a': 1


    there is only the need to concatenate one result key ('a.a.a.a.a.a'), but the code concatenates five result keys: not just the one that we need, but 'a.a', 'a.a.a', 'a.a.a.a' and 'a.a.a.a.a' as well. The way to avoid this is to keep a stack of dictionary keys currently being visited, and use str.join on the stack when you need to concatenate the result key.



3. Revised code



def flatten_dictionary(d):
result =
stack = [iter(d.items())]
keys =
while stack:
for k, v in stack[-1]:
keys.append(k)
if isinstance(v, dict):
stack.append(iter(v.items()))
break
else:
result['.'.join(keys)] = v
keys.pop()
else:
if keys:
keys.pop()
stack.pop()
return result





share|improve this answer













1. Bugs




  1. The special case for empty keys:



    if key == "":
    yield subkey, subvalue
    yield key + "." + subkey, subvalue


    is missing an else: and so leads to items appearing twice:



    >>> flatten_dictionary("": "a":1)
    'a': 1, '.a': 1



  2. Even if we fix the bug by adding the else:, there's still a problem. Consider this example:



    >>> flatten_dictionary("a": "": "b": 1, "": "a": "b": 2)
    'a.b': 2


    What happened to the 1? Ignoring the empty string keys has led two different keys to be collapsed into one. It would be better to remove the special case for empty strings, so that:



    >>> flatten_dictionary("a": "": "b": 1, "": "a": "b": 2)
    'a..b': 1, '.a.b': 2



  3. Python has a limited stack, but dictionaries can be nested arbitrarily deeply:



    >>> d = 
    >>> for i in range(1000): d = 'a':d
    ...
    >>> flatten_dictionary(d)
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "cr186013.py", line 22, in flatten_dictionary
    return dict(items())
    [... many line omitted ...]
    File "cr186013.py", line 13, in items
    for subkey, subvalue in flatten_dictionary(value).items():
    RecursionError: maximum recursion depth exceeded


    This problem can be avoided using the stack of iterators pattern.



2. Other review points




  1. The comment does not match the code: there is nothing in the code corresponding to "check if the sub-key and sub-value are inside the flatten_dict(value)" or "join on subkey array" or "clear out prev_keys". Incorrect comments like this are worse than useless: they make it harder to understand and maintain the code.



    Did this comment describe an earlier version of the code, and then you changed the code but forgot to edit the comment? It is worth getting into the habit of changing the comment first so that you don't forget.




  2. The construction of the result keys has unnecessary quadratic runtime. For example, in this situation:



    >>> flatten_dictionary('a': 'a': 'a': 'a': 'a': 'a': 1)
    'a.a.a.a.a.a': 1


    there is only the need to concatenate one result key ('a.a.a.a.a.a'), but the code concatenates five result keys: not just the one that we need, but 'a.a', 'a.a.a', 'a.a.a.a' and 'a.a.a.a.a' as well. The way to avoid this is to keep a stack of dictionary keys currently being visited, and use str.join on the stack when you need to concatenate the result key.



3. Revised code



def flatten_dictionary(d):
result =
stack = [iter(d.items())]
keys =
while stack:
for k, v in stack[-1]:
keys.append(k)
if isinstance(v, dict):
stack.append(iter(v.items()))
break
else:
result['.'.join(keys)] = v
keys.pop()
else:
if keys:
keys.pop()
stack.pop()
return result






share|improve this answer













share|improve this answer



share|improve this answer











answered Jan 27 at 13:07









Gareth Rees

41.1k394168




41.1k394168











  • I don't think it can pass this test cases: Input: "":"a":"1","b":"3" Expected: "a":"1","b":"3" Actual: '.a': '1', 'b': '3'
    – NinjaG
    Jan 28 at 4:38










  • See §1.2 for why special handling of empty strings is a bad idea.
    – Gareth Rees
    Jan 28 at 10:14

















  • I don't think it can pass this test cases: Input: "":"a":"1","b":"3" Expected: "a":"1","b":"3" Actual: '.a': '1', 'b': '3'
    – NinjaG
    Jan 28 at 4:38










  • See §1.2 for why special handling of empty strings is a bad idea.
    – Gareth Rees
    Jan 28 at 10:14
















I don't think it can pass this test cases: Input: "":"a":"1","b":"3" Expected: "a":"1","b":"3" Actual: '.a': '1', 'b': '3'
– NinjaG
Jan 28 at 4:38




I don't think it can pass this test cases: Input: "":"a":"1","b":"3" Expected: "a":"1","b":"3" Actual: '.a': '1', 'b': '3'
– NinjaG
Jan 28 at 4:38












See §1.2 for why special handling of empty strings is a bad idea.
– Gareth Rees
Jan 28 at 10:14





See §1.2 for why special handling of empty strings is a bad idea.
– Gareth Rees
Jan 28 at 10:14













 

draft saved


draft discarded


























 


draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f186013%2fflatten-dictionary-python-challenge%23new-answer', 'question_page');

);

Post as a guest













































































Popular posts from this blog

Chat program with C++ and SFML

Function to Return a JSON Like Objects Using VBA Collections and Arrays

Will my employers contract hold up in court?