Trie Data Structure Implementation in C++11 using Smart Pointers â follow-up
Clash Royale CLAN TAG#URR8PPP
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;
up vote
11
down vote
favorite
Link to my first question.
I followed @JDÃ Âugosz recommendations. How does it look? Do you have any further recommendations? Is it better (if possible) to replace shared_ptr with unique_ptr? How could someone extend it to use the Unicode character set?
#pragma once
#include <iostream>
#include <memory>
#include <string>
namespace forest
class trie
private:
struct Node
std::shared_ptr<Node> children[26];
bool end = false;
;
std::shared_ptr<Node> root = std::make_shared<Node>();
public:
void insert(const std::string & key)
std::shared_ptr<Node> n = root;
for (auto c : key)
int index = c - 'a';
auto& slot = n->children[index];
if (!slot) slot = std::make_shared<Node>();
n = slot;
n->end = true;
bool search(const std::string & key)
std::shared_ptr<Node> n = root;
for (auto c : key)
int index = c - 'a';
auto& slot = n->children[index];
if (!slot) return false;
n = slot;
return n && n->end;
;
c++ performance c++11 pointers trie
add a comment |Â
up vote
11
down vote
favorite
Link to my first question.
I followed @JDÃ Âugosz recommendations. How does it look? Do you have any further recommendations? Is it better (if possible) to replace shared_ptr with unique_ptr? How could someone extend it to use the Unicode character set?
#pragma once
#include <iostream>
#include <memory>
#include <string>
namespace forest
class trie
private:
struct Node
std::shared_ptr<Node> children[26];
bool end = false;
;
std::shared_ptr<Node> root = std::make_shared<Node>();
public:
void insert(const std::string & key)
std::shared_ptr<Node> n = root;
for (auto c : key)
int index = c - 'a';
auto& slot = n->children[index];
if (!slot) slot = std::make_shared<Node>();
n = slot;
n->end = true;
bool search(const std::string & key)
std::shared_ptr<Node> n = root;
for (auto c : key)
int index = c - 'a';
auto& slot = n->children[index];
if (!slot) return false;
n = slot;
return n && n->end;
;
c++ performance c++11 pointers trie
Just added an answer with regard toshared_ptr
to the previous questions
â Harald Scheirich
Apr 6 at 14:30
add a comment |Â
up vote
11
down vote
favorite
up vote
11
down vote
favorite
Link to my first question.
I followed @JDÃ Âugosz recommendations. How does it look? Do you have any further recommendations? Is it better (if possible) to replace shared_ptr with unique_ptr? How could someone extend it to use the Unicode character set?
#pragma once
#include <iostream>
#include <memory>
#include <string>
namespace forest
class trie
private:
struct Node
std::shared_ptr<Node> children[26];
bool end = false;
;
std::shared_ptr<Node> root = std::make_shared<Node>();
public:
void insert(const std::string & key)
std::shared_ptr<Node> n = root;
for (auto c : key)
int index = c - 'a';
auto& slot = n->children[index];
if (!slot) slot = std::make_shared<Node>();
n = slot;
n->end = true;
bool search(const std::string & key)
std::shared_ptr<Node> n = root;
for (auto c : key)
int index = c - 'a';
auto& slot = n->children[index];
if (!slot) return false;
n = slot;
return n && n->end;
;
c++ performance c++11 pointers trie
Link to my first question.
I followed @JDÃ Âugosz recommendations. How does it look? Do you have any further recommendations? Is it better (if possible) to replace shared_ptr with unique_ptr? How could someone extend it to use the Unicode character set?
#pragma once
#include <iostream>
#include <memory>
#include <string>
namespace forest
class trie
private:
struct Node
std::shared_ptr<Node> children[26];
bool end = false;
;
std::shared_ptr<Node> root = std::make_shared<Node>();
public:
void insert(const std::string & key)
std::shared_ptr<Node> n = root;
for (auto c : key)
int index = c - 'a';
auto& slot = n->children[index];
if (!slot) slot = std::make_shared<Node>();
n = slot;
n->end = true;
bool search(const std::string & key)
std::shared_ptr<Node> n = root;
for (auto c : key)
int index = c - 'a';
auto& slot = n->children[index];
if (!slot) return false;
n = slot;
return n && n->end;
;
c++ performance c++11 pointers trie
asked Apr 6 at 14:25
xorz57
32912
32912
Just added an answer with regard toshared_ptr
to the previous questions
â Harald Scheirich
Apr 6 at 14:30
add a comment |Â
Just added an answer with regard toshared_ptr
to the previous questions
â Harald Scheirich
Apr 6 at 14:30
Just added an answer with regard to
shared_ptr
to the previous questionsâ Harald Scheirich
Apr 6 at 14:30
Just added an answer with regard to
shared_ptr
to the previous questionsâ Harald Scheirich
Apr 6 at 14:30
add a comment |Â
2 Answers
2
active
oldest
votes
up vote
3
down vote
accepted
pointers
I supposed the use of shared pointers was the possibility of sharing tree representation across instances, "persistence", and transactions. IâÂÂve just watched some presentations on persistent data structures (and on GoogleâÂÂs trie, for that matter) so that was in my mind.
I agree with Frank about pointers. When calling code that operates on an object, it doesnâÂÂt care how the object is owned, so making it take an argument of type shared_ptr means it canâÂÂt take objects owned by unique_ptr, or that are direct members of larger structures, or on the stack, etc. So these arguments are passed as a reference.
In the Standard Guidelines, pointers are always non-owning. You mark them with owner<> to indicate otherwise.
I agree that the root does not need to be dynamically allocated. But you need to avoid having an ugly special case for the first node.
Node* n = &root;
for (auto c : key)
int index = to_index(c);
auto& slot = n->children[index];
if (!slot) slot = make_unique<Node>();
n= slot.get();
duplicated traversal code
I note that both functions have the same logic to traverse the tree, in their core. Normally, like in standard containers, there will be a single function that does this, and it is used by all the other functions.
If these are the only two functions you have, itâÂÂs probably not worth the effort. But if you have more (remove, find the closest match, etc.) then you should do that.
26
The first thing I noticed on your update was that you replaced the evil macro with a magic number, rather than a better way of defining a constant.
static constexpr size_t nodesize = 1+'z'-'a';
std::unique_ptr<Node> children[nodesize];
bad key
int index = c - 'a'; // note signed result wanted
if (index<0 || index>=nodesize) throw invalid_argument("oops");
Both functions go over a string in the same manner, so make that a common function.
int index = to_index(c);
Portability of character encoding
ItâÂÂs been noted that the letters are not necessarily contiguous in the source character set. However, if you are writing in (original) EBCDIC you have worse problems and would not be able to type the
characters into the source file. (IâÂÂve discussed C++ on a primitive type of forum software running on an EBCDIC system that was lacking [ ]
and a few others, and it is not simple.
The execution character set is distinct from the source character set, and depends on the locale. More generally, you can see that it depends on the source of the strings such as a saved file â if the file uses a character set that doesnâÂÂt use the same codes for letters as we expected, then things will go bad.
So, part of the specification is that the input strings will always be in UTF-8, or (sufficient for our purposes) is compatible with ASCII.
What about at compile-time though? The standard says that the value of a character literal 'a'
is in the execution character set, not the source character set, which is good. Except that the execution character set is not known until run time, so how can it do that?
However, you can specify that a character is using UTF-8, regardless of any locale or whatnot going on in the compiler or target system.
static constexpr size_t nodesize = 1+u8'z'-u8'a';
Is it possible to avoid that std::make_unique because it uses c++14? I am asking this because initially, i wanted my library to use c++11 only. Does std::shared_ptr really hurt in this implementation?
â xorz57
Apr 7 at 0:55
If at some point an exception is thrown the string would be half written I fear... that's something else that bothers me.
â xorz57
Apr 7 at 0:57
make_unique
doesn't require C++14 to implement.std::make_unique
not being in C++11 was an oversight.std::shared_ptr
will work, but it's overkill. Each subtrie in the node array is owned by one parent trie. You don't need thestd::shared_ptr
control block/ref-counting. As for providing the strong exception guarantee on insertion, Create a new tree for the insertion term and merge. You could even optimize this by prefix searching and only creating/transferring the new suffix.
â Snowhawk
Apr 7 at 1:29
@xorz57 just add the missing def yourself, and use that. Performance: benchmark it both ways and see!
â JDà Âugosz
Apr 7 at 3:20
Btw,'z'-'a'
is not guaranteed to be 25 in C / C++. Only the order of 0 - 9 is guaranteed. It may not make a huge difference in practice, but I generally try to use"abcdefghijklmnopqrstuvwxyz"
in some way (defining it somewhere and doingcountof
orstrlen
in this case).
â Millie Smith
Apr 7 at 3:34
 |Â
show 6 more comments
up vote
11
down vote
Input sanitization
Your functions take a std::string
as parameter, as such, they should be "well behaved" for any possible std::string
passed to them. Note that well behaved does not mean that it has to "work", just that it should not break anything.
Specifically, what would happen if I passed the string "Hello"
to that function? 'H' - 'a'
is -25
, ruh roh!
There's a few different ways to tackle this.
You could convert all upper-case letters to lower case, but that does not fix punctuation marks, spaces and numbers. I also personally think that wether the trie is case-sensitive or not should be the user's problem, not the trie itself.
You could bail out of the function if a unhandled character is hit.
Just expand
children
to 256 instead of 26 so that all legal values ofchar
are handled properly. Sure your trie is going to be 5 times as big, but that's a fairly minor detail since it grows logarithmically.
edit: This last approach also makes the trie operate on raw data instead of characters, which makes it encoding-agnostic (which gives you your unicode support)
Avoid using shared_ptr unless absolutely necessary
shared pointers are clunky and heavy affairs. On top of this, using them sends a message to people reading the code: Ownership of the object is potentially shared between multiple owners, which is simply not the case here.
in your case, a std::unique_ptr<>
is absolutely fine.
Your root does not need to be dynamically allocated
It's created at construction unconditionally, and destroyed at destruction unconditionally. On top of that, it does not make use of type erasure (polymorphism). As such, there's no reason for it not to be a normal member of trie.
std::shared_ptr<Node> root = std::make_shared<Node>();
becomes:
Node root;
This will require you to change your search and insert functions so that the first line becomes:
const Node* n = &root;
But that's fine because that's preferable once you move to unique_ptr
anyways.
Edit: On that note:
Raw pointers are not evil
std::shared_ptr<Node> n = root;
We tend to teach people that "you should never use raw pointers". But I find that extremely naive. The real rule is "You should never have raw pointers with ownership".
There is absolutely nothing wrong with using raw pointers as long as it's understood that:
- The pointer only refers to the pointed object without owning it.
- The lifetime of the pointer assignment to the object is fully enclosed by the lifetime of whatever DOES own the object.
In your code using shared_ptr
, using the following would have been 100% ok, and much better in my opinion:
const Node* n = root.get();
Mark non-mutating functions as const.
Your search()
member function should not alter the trie in any way, so you should mark it as const
like this:
bool search(const std::string & key) const {
There's a few subtle advantages for the compiler, but the main one is that if you goof up and accidentally do something that changes the trie, the compiler will tell you.
nitpick: private:
is redundant here.
The namespace of a class is private by default. Fun fact: that's the only difference between a class
and a struct
.
nitpick: redundant null-checking
in your search function, when I read the last line:
return n && n->end;
My impression was "Oh! n
can be possibly null
in certain cases", which led me to search for the scenario where that might be the case. It's misleading to the reader.
Defensive programming can be useful at times, but this is just excessive.
Great answer. My only complaint: you are too timid. YouâÂÂre saying that âÂÂin [your] opinionâ it would be better to use a raw pointer (rather than a smart pointer) in situation you point out. But this isnâÂÂt a matter of opinion. Using astd::shared_ptr
here is wrong (because it implies incorrect ownership semantics). At a stretch you could use astd::weak_ptr
here but IâÂÂd argue that even that is wrong since that, too, implies a semantic that doesnâÂÂt hold: namely that the owner might relinquish the resource while weâÂÂre accessing it. No. This is a raw pointer.
â Konrad Rudolph
Apr 6 at 16:38
@KonradRudolph A strong wrong would imply that the code is not well formed and fundamentally incorrect (like the out of bounds error I pointed out). This is not the case here. It's bad code, and should be changed, but it's not wrong code, it works. But I tend to agree, it's pretty black-and-white here.
â Frank
Apr 6 at 16:43
@Frank Ill-formedness and runtime errors are merely specific subsets of âÂÂwrongâÂÂ. For instance, a program can be well-formed, have no runtime errors, but give the wrong result. Or it can give the right result purely by accident. This is an instance of the latter: using a shared pointer happens to work here but it violates the use-case that it was designed for (due to the implicit specification of the algorithm, which has no ownership here). This use-case canâÂÂt be enforced programmatically but that doesnâÂÂt make the usage less wrong, merely unenforceable.
â Konrad Rudolph
Apr 6 at 16:48
@KonradRudolph But temporarily extending the lifetime of a shared object during transient processing is a legit (and common if you ever used ASIO) use-case. Using that technique here falls under the overkill, excessively defensive category of errors, similar to that null check.
â Frank
Apr 6 at 16:54
@Frank ThereâÂÂs no lifetime extension happening here. This is a pleonasm. Pleonasms, unless they serve a purpose in documenting expected behaviour (= defensive coding), are logic errors that simply happen to have no effect on the correctness of the result.
â Konrad Rudolph
Apr 6 at 17:21
add a comment |Â
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
3
down vote
accepted
pointers
I supposed the use of shared pointers was the possibility of sharing tree representation across instances, "persistence", and transactions. IâÂÂve just watched some presentations on persistent data structures (and on GoogleâÂÂs trie, for that matter) so that was in my mind.
I agree with Frank about pointers. When calling code that operates on an object, it doesnâÂÂt care how the object is owned, so making it take an argument of type shared_ptr means it canâÂÂt take objects owned by unique_ptr, or that are direct members of larger structures, or on the stack, etc. So these arguments are passed as a reference.
In the Standard Guidelines, pointers are always non-owning. You mark them with owner<> to indicate otherwise.
I agree that the root does not need to be dynamically allocated. But you need to avoid having an ugly special case for the first node.
Node* n = &root;
for (auto c : key)
int index = to_index(c);
auto& slot = n->children[index];
if (!slot) slot = make_unique<Node>();
n= slot.get();
duplicated traversal code
I note that both functions have the same logic to traverse the tree, in their core. Normally, like in standard containers, there will be a single function that does this, and it is used by all the other functions.
If these are the only two functions you have, itâÂÂs probably not worth the effort. But if you have more (remove, find the closest match, etc.) then you should do that.
26
The first thing I noticed on your update was that you replaced the evil macro with a magic number, rather than a better way of defining a constant.
static constexpr size_t nodesize = 1+'z'-'a';
std::unique_ptr<Node> children[nodesize];
bad key
int index = c - 'a'; // note signed result wanted
if (index<0 || index>=nodesize) throw invalid_argument("oops");
Both functions go over a string in the same manner, so make that a common function.
int index = to_index(c);
Portability of character encoding
ItâÂÂs been noted that the letters are not necessarily contiguous in the source character set. However, if you are writing in (original) EBCDIC you have worse problems and would not be able to type the
characters into the source file. (IâÂÂve discussed C++ on a primitive type of forum software running on an EBCDIC system that was lacking [ ]
and a few others, and it is not simple.
The execution character set is distinct from the source character set, and depends on the locale. More generally, you can see that it depends on the source of the strings such as a saved file â if the file uses a character set that doesnâÂÂt use the same codes for letters as we expected, then things will go bad.
So, part of the specification is that the input strings will always be in UTF-8, or (sufficient for our purposes) is compatible with ASCII.
What about at compile-time though? The standard says that the value of a character literal 'a'
is in the execution character set, not the source character set, which is good. Except that the execution character set is not known until run time, so how can it do that?
However, you can specify that a character is using UTF-8, regardless of any locale or whatnot going on in the compiler or target system.
static constexpr size_t nodesize = 1+u8'z'-u8'a';
Is it possible to avoid that std::make_unique because it uses c++14? I am asking this because initially, i wanted my library to use c++11 only. Does std::shared_ptr really hurt in this implementation?
â xorz57
Apr 7 at 0:55
If at some point an exception is thrown the string would be half written I fear... that's something else that bothers me.
â xorz57
Apr 7 at 0:57
make_unique
doesn't require C++14 to implement.std::make_unique
not being in C++11 was an oversight.std::shared_ptr
will work, but it's overkill. Each subtrie in the node array is owned by one parent trie. You don't need thestd::shared_ptr
control block/ref-counting. As for providing the strong exception guarantee on insertion, Create a new tree for the insertion term and merge. You could even optimize this by prefix searching and only creating/transferring the new suffix.
â Snowhawk
Apr 7 at 1:29
@xorz57 just add the missing def yourself, and use that. Performance: benchmark it both ways and see!
â JDà Âugosz
Apr 7 at 3:20
Btw,'z'-'a'
is not guaranteed to be 25 in C / C++. Only the order of 0 - 9 is guaranteed. It may not make a huge difference in practice, but I generally try to use"abcdefghijklmnopqrstuvwxyz"
in some way (defining it somewhere and doingcountof
orstrlen
in this case).
â Millie Smith
Apr 7 at 3:34
 |Â
show 6 more comments
up vote
3
down vote
accepted
pointers
I supposed the use of shared pointers was the possibility of sharing tree representation across instances, "persistence", and transactions. IâÂÂve just watched some presentations on persistent data structures (and on GoogleâÂÂs trie, for that matter) so that was in my mind.
I agree with Frank about pointers. When calling code that operates on an object, it doesnâÂÂt care how the object is owned, so making it take an argument of type shared_ptr means it canâÂÂt take objects owned by unique_ptr, or that are direct members of larger structures, or on the stack, etc. So these arguments are passed as a reference.
In the Standard Guidelines, pointers are always non-owning. You mark them with owner<> to indicate otherwise.
I agree that the root does not need to be dynamically allocated. But you need to avoid having an ugly special case for the first node.
Node* n = &root;
for (auto c : key)
int index = to_index(c);
auto& slot = n->children[index];
if (!slot) slot = make_unique<Node>();
n= slot.get();
duplicated traversal code
I note that both functions have the same logic to traverse the tree, in their core. Normally, like in standard containers, there will be a single function that does this, and it is used by all the other functions.
If these are the only two functions you have, itâÂÂs probably not worth the effort. But if you have more (remove, find the closest match, etc.) then you should do that.
26
The first thing I noticed on your update was that you replaced the evil macro with a magic number, rather than a better way of defining a constant.
static constexpr size_t nodesize = 1+'z'-'a';
std::unique_ptr<Node> children[nodesize];
bad key
int index = c - 'a'; // note signed result wanted
if (index<0 || index>=nodesize) throw invalid_argument("oops");
Both functions go over a string in the same manner, so make that a common function.
int index = to_index(c);
Portability of character encoding
ItâÂÂs been noted that the letters are not necessarily contiguous in the source character set. However, if you are writing in (original) EBCDIC you have worse problems and would not be able to type the
characters into the source file. (IâÂÂve discussed C++ on a primitive type of forum software running on an EBCDIC system that was lacking [ ]
and a few others, and it is not simple.
The execution character set is distinct from the source character set, and depends on the locale. More generally, you can see that it depends on the source of the strings such as a saved file â if the file uses a character set that doesnâÂÂt use the same codes for letters as we expected, then things will go bad.
So, part of the specification is that the input strings will always be in UTF-8, or (sufficient for our purposes) is compatible with ASCII.
What about at compile-time though? The standard says that the value of a character literal 'a'
is in the execution character set, not the source character set, which is good. Except that the execution character set is not known until run time, so how can it do that?
However, you can specify that a character is using UTF-8, regardless of any locale or whatnot going on in the compiler or target system.
static constexpr size_t nodesize = 1+u8'z'-u8'a';
Is it possible to avoid that std::make_unique because it uses c++14? I am asking this because initially, i wanted my library to use c++11 only. Does std::shared_ptr really hurt in this implementation?
â xorz57
Apr 7 at 0:55
If at some point an exception is thrown the string would be half written I fear... that's something else that bothers me.
â xorz57
Apr 7 at 0:57
make_unique
doesn't require C++14 to implement.std::make_unique
not being in C++11 was an oversight.std::shared_ptr
will work, but it's overkill. Each subtrie in the node array is owned by one parent trie. You don't need thestd::shared_ptr
control block/ref-counting. As for providing the strong exception guarantee on insertion, Create a new tree for the insertion term and merge. You could even optimize this by prefix searching and only creating/transferring the new suffix.
â Snowhawk
Apr 7 at 1:29
@xorz57 just add the missing def yourself, and use that. Performance: benchmark it both ways and see!
â JDà Âugosz
Apr 7 at 3:20
Btw,'z'-'a'
is not guaranteed to be 25 in C / C++. Only the order of 0 - 9 is guaranteed. It may not make a huge difference in practice, but I generally try to use"abcdefghijklmnopqrstuvwxyz"
in some way (defining it somewhere and doingcountof
orstrlen
in this case).
â Millie Smith
Apr 7 at 3:34
 |Â
show 6 more comments
up vote
3
down vote
accepted
up vote
3
down vote
accepted
pointers
I supposed the use of shared pointers was the possibility of sharing tree representation across instances, "persistence", and transactions. IâÂÂve just watched some presentations on persistent data structures (and on GoogleâÂÂs trie, for that matter) so that was in my mind.
I agree with Frank about pointers. When calling code that operates on an object, it doesnâÂÂt care how the object is owned, so making it take an argument of type shared_ptr means it canâÂÂt take objects owned by unique_ptr, or that are direct members of larger structures, or on the stack, etc. So these arguments are passed as a reference.
In the Standard Guidelines, pointers are always non-owning. You mark them with owner<> to indicate otherwise.
I agree that the root does not need to be dynamically allocated. But you need to avoid having an ugly special case for the first node.
Node* n = &root;
for (auto c : key)
int index = to_index(c);
auto& slot = n->children[index];
if (!slot) slot = make_unique<Node>();
n= slot.get();
duplicated traversal code
I note that both functions have the same logic to traverse the tree, in their core. Normally, like in standard containers, there will be a single function that does this, and it is used by all the other functions.
If these are the only two functions you have, itâÂÂs probably not worth the effort. But if you have more (remove, find the closest match, etc.) then you should do that.
26
The first thing I noticed on your update was that you replaced the evil macro with a magic number, rather than a better way of defining a constant.
static constexpr size_t nodesize = 1+'z'-'a';
std::unique_ptr<Node> children[nodesize];
bad key
int index = c - 'a'; // note signed result wanted
if (index<0 || index>=nodesize) throw invalid_argument("oops");
Both functions go over a string in the same manner, so make that a common function.
int index = to_index(c);
Portability of character encoding
ItâÂÂs been noted that the letters are not necessarily contiguous in the source character set. However, if you are writing in (original) EBCDIC you have worse problems and would not be able to type the
characters into the source file. (IâÂÂve discussed C++ on a primitive type of forum software running on an EBCDIC system that was lacking [ ]
and a few others, and it is not simple.
The execution character set is distinct from the source character set, and depends on the locale. More generally, you can see that it depends on the source of the strings such as a saved file â if the file uses a character set that doesnâÂÂt use the same codes for letters as we expected, then things will go bad.
So, part of the specification is that the input strings will always be in UTF-8, or (sufficient for our purposes) is compatible with ASCII.
What about at compile-time though? The standard says that the value of a character literal 'a'
is in the execution character set, not the source character set, which is good. Except that the execution character set is not known until run time, so how can it do that?
However, you can specify that a character is using UTF-8, regardless of any locale or whatnot going on in the compiler or target system.
static constexpr size_t nodesize = 1+u8'z'-u8'a';
pointers
I supposed the use of shared pointers was the possibility of sharing tree representation across instances, "persistence", and transactions. IâÂÂve just watched some presentations on persistent data structures (and on GoogleâÂÂs trie, for that matter) so that was in my mind.
I agree with Frank about pointers. When calling code that operates on an object, it doesnâÂÂt care how the object is owned, so making it take an argument of type shared_ptr means it canâÂÂt take objects owned by unique_ptr, or that are direct members of larger structures, or on the stack, etc. So these arguments are passed as a reference.
In the Standard Guidelines, pointers are always non-owning. You mark them with owner<> to indicate otherwise.
I agree that the root does not need to be dynamically allocated. But you need to avoid having an ugly special case for the first node.
Node* n = &root;
for (auto c : key)
int index = to_index(c);
auto& slot = n->children[index];
if (!slot) slot = make_unique<Node>();
n= slot.get();
duplicated traversal code
I note that both functions have the same logic to traverse the tree, in their core. Normally, like in standard containers, there will be a single function that does this, and it is used by all the other functions.
If these are the only two functions you have, itâÂÂs probably not worth the effort. But if you have more (remove, find the closest match, etc.) then you should do that.
26
The first thing I noticed on your update was that you replaced the evil macro with a magic number, rather than a better way of defining a constant.
static constexpr size_t nodesize = 1+'z'-'a';
std::unique_ptr<Node> children[nodesize];
bad key
int index = c - 'a'; // note signed result wanted
if (index<0 || index>=nodesize) throw invalid_argument("oops");
Both functions go over a string in the same manner, so make that a common function.
int index = to_index(c);
Portability of character encoding
ItâÂÂs been noted that the letters are not necessarily contiguous in the source character set. However, if you are writing in (original) EBCDIC you have worse problems and would not be able to type the
characters into the source file. (IâÂÂve discussed C++ on a primitive type of forum software running on an EBCDIC system that was lacking [ ]
and a few others, and it is not simple.
The execution character set is distinct from the source character set, and depends on the locale. More generally, you can see that it depends on the source of the strings such as a saved file â if the file uses a character set that doesnâÂÂt use the same codes for letters as we expected, then things will go bad.
So, part of the specification is that the input strings will always be in UTF-8, or (sufficient for our purposes) is compatible with ASCII.
What about at compile-time though? The standard says that the value of a character literal 'a'
is in the execution character set, not the source character set, which is good. Except that the execution character set is not known until run time, so how can it do that?
However, you can specify that a character is using UTF-8, regardless of any locale or whatnot going on in the compiler or target system.
static constexpr size_t nodesize = 1+u8'z'-u8'a';
edited Apr 7 at 23:04
answered Apr 6 at 23:59
JDÃ Âugosz
5,047731
5,047731
Is it possible to avoid that std::make_unique because it uses c++14? I am asking this because initially, i wanted my library to use c++11 only. Does std::shared_ptr really hurt in this implementation?
â xorz57
Apr 7 at 0:55
If at some point an exception is thrown the string would be half written I fear... that's something else that bothers me.
â xorz57
Apr 7 at 0:57
make_unique
doesn't require C++14 to implement.std::make_unique
not being in C++11 was an oversight.std::shared_ptr
will work, but it's overkill. Each subtrie in the node array is owned by one parent trie. You don't need thestd::shared_ptr
control block/ref-counting. As for providing the strong exception guarantee on insertion, Create a new tree for the insertion term and merge. You could even optimize this by prefix searching and only creating/transferring the new suffix.
â Snowhawk
Apr 7 at 1:29
@xorz57 just add the missing def yourself, and use that. Performance: benchmark it both ways and see!
â JDà Âugosz
Apr 7 at 3:20
Btw,'z'-'a'
is not guaranteed to be 25 in C / C++. Only the order of 0 - 9 is guaranteed. It may not make a huge difference in practice, but I generally try to use"abcdefghijklmnopqrstuvwxyz"
in some way (defining it somewhere and doingcountof
orstrlen
in this case).
â Millie Smith
Apr 7 at 3:34
 |Â
show 6 more comments
Is it possible to avoid that std::make_unique because it uses c++14? I am asking this because initially, i wanted my library to use c++11 only. Does std::shared_ptr really hurt in this implementation?
â xorz57
Apr 7 at 0:55
If at some point an exception is thrown the string would be half written I fear... that's something else that bothers me.
â xorz57
Apr 7 at 0:57
make_unique
doesn't require C++14 to implement.std::make_unique
not being in C++11 was an oversight.std::shared_ptr
will work, but it's overkill. Each subtrie in the node array is owned by one parent trie. You don't need thestd::shared_ptr
control block/ref-counting. As for providing the strong exception guarantee on insertion, Create a new tree for the insertion term and merge. You could even optimize this by prefix searching and only creating/transferring the new suffix.
â Snowhawk
Apr 7 at 1:29
@xorz57 just add the missing def yourself, and use that. Performance: benchmark it both ways and see!
â JDà Âugosz
Apr 7 at 3:20
Btw,'z'-'a'
is not guaranteed to be 25 in C / C++. Only the order of 0 - 9 is guaranteed. It may not make a huge difference in practice, but I generally try to use"abcdefghijklmnopqrstuvwxyz"
in some way (defining it somewhere and doingcountof
orstrlen
in this case).
â Millie Smith
Apr 7 at 3:34
Is it possible to avoid that std::make_unique because it uses c++14? I am asking this because initially, i wanted my library to use c++11 only. Does std::shared_ptr really hurt in this implementation?
â xorz57
Apr 7 at 0:55
Is it possible to avoid that std::make_unique because it uses c++14? I am asking this because initially, i wanted my library to use c++11 only. Does std::shared_ptr really hurt in this implementation?
â xorz57
Apr 7 at 0:55
If at some point an exception is thrown the string would be half written I fear... that's something else that bothers me.
â xorz57
Apr 7 at 0:57
If at some point an exception is thrown the string would be half written I fear... that's something else that bothers me.
â xorz57
Apr 7 at 0:57
make_unique
doesn't require C++14 to implement. std::make_unique
not being in C++11 was an oversight. std::shared_ptr
will work, but it's overkill. Each subtrie in the node array is owned by one parent trie. You don't need the std::shared_ptr
control block/ref-counting. As for providing the strong exception guarantee on insertion, Create a new tree for the insertion term and merge. You could even optimize this by prefix searching and only creating/transferring the new suffix.â Snowhawk
Apr 7 at 1:29
make_unique
doesn't require C++14 to implement. std::make_unique
not being in C++11 was an oversight. std::shared_ptr
will work, but it's overkill. Each subtrie in the node array is owned by one parent trie. You don't need the std::shared_ptr
control block/ref-counting. As for providing the strong exception guarantee on insertion, Create a new tree for the insertion term and merge. You could even optimize this by prefix searching and only creating/transferring the new suffix.â Snowhawk
Apr 7 at 1:29
@xorz57 just add the missing def yourself, and use that. Performance: benchmark it both ways and see!
â JDà Âugosz
Apr 7 at 3:20
@xorz57 just add the missing def yourself, and use that. Performance: benchmark it both ways and see!
â JDà Âugosz
Apr 7 at 3:20
Btw,
'z'-'a'
is not guaranteed to be 25 in C / C++. Only the order of 0 - 9 is guaranteed. It may not make a huge difference in practice, but I generally try to use "abcdefghijklmnopqrstuvwxyz"
in some way (defining it somewhere and doing countof
or strlen
in this case).â Millie Smith
Apr 7 at 3:34
Btw,
'z'-'a'
is not guaranteed to be 25 in C / C++. Only the order of 0 - 9 is guaranteed. It may not make a huge difference in practice, but I generally try to use "abcdefghijklmnopqrstuvwxyz"
in some way (defining it somewhere and doing countof
or strlen
in this case).â Millie Smith
Apr 7 at 3:34
 |Â
show 6 more comments
up vote
11
down vote
Input sanitization
Your functions take a std::string
as parameter, as such, they should be "well behaved" for any possible std::string
passed to them. Note that well behaved does not mean that it has to "work", just that it should not break anything.
Specifically, what would happen if I passed the string "Hello"
to that function? 'H' - 'a'
is -25
, ruh roh!
There's a few different ways to tackle this.
You could convert all upper-case letters to lower case, but that does not fix punctuation marks, spaces and numbers. I also personally think that wether the trie is case-sensitive or not should be the user's problem, not the trie itself.
You could bail out of the function if a unhandled character is hit.
Just expand
children
to 256 instead of 26 so that all legal values ofchar
are handled properly. Sure your trie is going to be 5 times as big, but that's a fairly minor detail since it grows logarithmically.
edit: This last approach also makes the trie operate on raw data instead of characters, which makes it encoding-agnostic (which gives you your unicode support)
Avoid using shared_ptr unless absolutely necessary
shared pointers are clunky and heavy affairs. On top of this, using them sends a message to people reading the code: Ownership of the object is potentially shared between multiple owners, which is simply not the case here.
in your case, a std::unique_ptr<>
is absolutely fine.
Your root does not need to be dynamically allocated
It's created at construction unconditionally, and destroyed at destruction unconditionally. On top of that, it does not make use of type erasure (polymorphism). As such, there's no reason for it not to be a normal member of trie.
std::shared_ptr<Node> root = std::make_shared<Node>();
becomes:
Node root;
This will require you to change your search and insert functions so that the first line becomes:
const Node* n = &root;
But that's fine because that's preferable once you move to unique_ptr
anyways.
Edit: On that note:
Raw pointers are not evil
std::shared_ptr<Node> n = root;
We tend to teach people that "you should never use raw pointers". But I find that extremely naive. The real rule is "You should never have raw pointers with ownership".
There is absolutely nothing wrong with using raw pointers as long as it's understood that:
- The pointer only refers to the pointed object without owning it.
- The lifetime of the pointer assignment to the object is fully enclosed by the lifetime of whatever DOES own the object.
In your code using shared_ptr
, using the following would have been 100% ok, and much better in my opinion:
const Node* n = root.get();
Mark non-mutating functions as const.
Your search()
member function should not alter the trie in any way, so you should mark it as const
like this:
bool search(const std::string & key) const {
There's a few subtle advantages for the compiler, but the main one is that if you goof up and accidentally do something that changes the trie, the compiler will tell you.
nitpick: private:
is redundant here.
The namespace of a class is private by default. Fun fact: that's the only difference between a class
and a struct
.
nitpick: redundant null-checking
in your search function, when I read the last line:
return n && n->end;
My impression was "Oh! n
can be possibly null
in certain cases", which led me to search for the scenario where that might be the case. It's misleading to the reader.
Defensive programming can be useful at times, but this is just excessive.
Great answer. My only complaint: you are too timid. YouâÂÂre saying that âÂÂin [your] opinionâ it would be better to use a raw pointer (rather than a smart pointer) in situation you point out. But this isnâÂÂt a matter of opinion. Using astd::shared_ptr
here is wrong (because it implies incorrect ownership semantics). At a stretch you could use astd::weak_ptr
here but IâÂÂd argue that even that is wrong since that, too, implies a semantic that doesnâÂÂt hold: namely that the owner might relinquish the resource while weâÂÂre accessing it. No. This is a raw pointer.
â Konrad Rudolph
Apr 6 at 16:38
@KonradRudolph A strong wrong would imply that the code is not well formed and fundamentally incorrect (like the out of bounds error I pointed out). This is not the case here. It's bad code, and should be changed, but it's not wrong code, it works. But I tend to agree, it's pretty black-and-white here.
â Frank
Apr 6 at 16:43
@Frank Ill-formedness and runtime errors are merely specific subsets of âÂÂwrongâÂÂ. For instance, a program can be well-formed, have no runtime errors, but give the wrong result. Or it can give the right result purely by accident. This is an instance of the latter: using a shared pointer happens to work here but it violates the use-case that it was designed for (due to the implicit specification of the algorithm, which has no ownership here). This use-case canâÂÂt be enforced programmatically but that doesnâÂÂt make the usage less wrong, merely unenforceable.
â Konrad Rudolph
Apr 6 at 16:48
@KonradRudolph But temporarily extending the lifetime of a shared object during transient processing is a legit (and common if you ever used ASIO) use-case. Using that technique here falls under the overkill, excessively defensive category of errors, similar to that null check.
â Frank
Apr 6 at 16:54
@Frank ThereâÂÂs no lifetime extension happening here. This is a pleonasm. Pleonasms, unless they serve a purpose in documenting expected behaviour (= defensive coding), are logic errors that simply happen to have no effect on the correctness of the result.
â Konrad Rudolph
Apr 6 at 17:21
add a comment |Â
up vote
11
down vote
Input sanitization
Your functions take a std::string
as parameter, as such, they should be "well behaved" for any possible std::string
passed to them. Note that well behaved does not mean that it has to "work", just that it should not break anything.
Specifically, what would happen if I passed the string "Hello"
to that function? 'H' - 'a'
is -25
, ruh roh!
There's a few different ways to tackle this.
You could convert all upper-case letters to lower case, but that does not fix punctuation marks, spaces and numbers. I also personally think that wether the trie is case-sensitive or not should be the user's problem, not the trie itself.
You could bail out of the function if a unhandled character is hit.
Just expand
children
to 256 instead of 26 so that all legal values ofchar
are handled properly. Sure your trie is going to be 5 times as big, but that's a fairly minor detail since it grows logarithmically.
edit: This last approach also makes the trie operate on raw data instead of characters, which makes it encoding-agnostic (which gives you your unicode support)
Avoid using shared_ptr unless absolutely necessary
shared pointers are clunky and heavy affairs. On top of this, using them sends a message to people reading the code: Ownership of the object is potentially shared between multiple owners, which is simply not the case here.
in your case, a std::unique_ptr<>
is absolutely fine.
Your root does not need to be dynamically allocated
It's created at construction unconditionally, and destroyed at destruction unconditionally. On top of that, it does not make use of type erasure (polymorphism). As such, there's no reason for it not to be a normal member of trie.
std::shared_ptr<Node> root = std::make_shared<Node>();
becomes:
Node root;
This will require you to change your search and insert functions so that the first line becomes:
const Node* n = &root;
But that's fine because that's preferable once you move to unique_ptr
anyways.
Edit: On that note:
Raw pointers are not evil
std::shared_ptr<Node> n = root;
We tend to teach people that "you should never use raw pointers". But I find that extremely naive. The real rule is "You should never have raw pointers with ownership".
There is absolutely nothing wrong with using raw pointers as long as it's understood that:
- The pointer only refers to the pointed object without owning it.
- The lifetime of the pointer assignment to the object is fully enclosed by the lifetime of whatever DOES own the object.
In your code using shared_ptr
, using the following would have been 100% ok, and much better in my opinion:
const Node* n = root.get();
Mark non-mutating functions as const.
Your search()
member function should not alter the trie in any way, so you should mark it as const
like this:
bool search(const std::string & key) const {
There's a few subtle advantages for the compiler, but the main one is that if you goof up and accidentally do something that changes the trie, the compiler will tell you.
nitpick: private:
is redundant here.
The namespace of a class is private by default. Fun fact: that's the only difference between a class
and a struct
.
nitpick: redundant null-checking
in your search function, when I read the last line:
return n && n->end;
My impression was "Oh! n
can be possibly null
in certain cases", which led me to search for the scenario where that might be the case. It's misleading to the reader.
Defensive programming can be useful at times, but this is just excessive.
Great answer. My only complaint: you are too timid. YouâÂÂre saying that âÂÂin [your] opinionâ it would be better to use a raw pointer (rather than a smart pointer) in situation you point out. But this isnâÂÂt a matter of opinion. Using astd::shared_ptr
here is wrong (because it implies incorrect ownership semantics). At a stretch you could use astd::weak_ptr
here but IâÂÂd argue that even that is wrong since that, too, implies a semantic that doesnâÂÂt hold: namely that the owner might relinquish the resource while weâÂÂre accessing it. No. This is a raw pointer.
â Konrad Rudolph
Apr 6 at 16:38
@KonradRudolph A strong wrong would imply that the code is not well formed and fundamentally incorrect (like the out of bounds error I pointed out). This is not the case here. It's bad code, and should be changed, but it's not wrong code, it works. But I tend to agree, it's pretty black-and-white here.
â Frank
Apr 6 at 16:43
@Frank Ill-formedness and runtime errors are merely specific subsets of âÂÂwrongâÂÂ. For instance, a program can be well-formed, have no runtime errors, but give the wrong result. Or it can give the right result purely by accident. This is an instance of the latter: using a shared pointer happens to work here but it violates the use-case that it was designed for (due to the implicit specification of the algorithm, which has no ownership here). This use-case canâÂÂt be enforced programmatically but that doesnâÂÂt make the usage less wrong, merely unenforceable.
â Konrad Rudolph
Apr 6 at 16:48
@KonradRudolph But temporarily extending the lifetime of a shared object during transient processing is a legit (and common if you ever used ASIO) use-case. Using that technique here falls under the overkill, excessively defensive category of errors, similar to that null check.
â Frank
Apr 6 at 16:54
@Frank ThereâÂÂs no lifetime extension happening here. This is a pleonasm. Pleonasms, unless they serve a purpose in documenting expected behaviour (= defensive coding), are logic errors that simply happen to have no effect on the correctness of the result.
â Konrad Rudolph
Apr 6 at 17:21
add a comment |Â
up vote
11
down vote
up vote
11
down vote
Input sanitization
Your functions take a std::string
as parameter, as such, they should be "well behaved" for any possible std::string
passed to them. Note that well behaved does not mean that it has to "work", just that it should not break anything.
Specifically, what would happen if I passed the string "Hello"
to that function? 'H' - 'a'
is -25
, ruh roh!
There's a few different ways to tackle this.
You could convert all upper-case letters to lower case, but that does not fix punctuation marks, spaces and numbers. I also personally think that wether the trie is case-sensitive or not should be the user's problem, not the trie itself.
You could bail out of the function if a unhandled character is hit.
Just expand
children
to 256 instead of 26 so that all legal values ofchar
are handled properly. Sure your trie is going to be 5 times as big, but that's a fairly minor detail since it grows logarithmically.
edit: This last approach also makes the trie operate on raw data instead of characters, which makes it encoding-agnostic (which gives you your unicode support)
Avoid using shared_ptr unless absolutely necessary
shared pointers are clunky and heavy affairs. On top of this, using them sends a message to people reading the code: Ownership of the object is potentially shared between multiple owners, which is simply not the case here.
in your case, a std::unique_ptr<>
is absolutely fine.
Your root does not need to be dynamically allocated
It's created at construction unconditionally, and destroyed at destruction unconditionally. On top of that, it does not make use of type erasure (polymorphism). As such, there's no reason for it not to be a normal member of trie.
std::shared_ptr<Node> root = std::make_shared<Node>();
becomes:
Node root;
This will require you to change your search and insert functions so that the first line becomes:
const Node* n = &root;
But that's fine because that's preferable once you move to unique_ptr
anyways.
Edit: On that note:
Raw pointers are not evil
std::shared_ptr<Node> n = root;
We tend to teach people that "you should never use raw pointers". But I find that extremely naive. The real rule is "You should never have raw pointers with ownership".
There is absolutely nothing wrong with using raw pointers as long as it's understood that:
- The pointer only refers to the pointed object without owning it.
- The lifetime of the pointer assignment to the object is fully enclosed by the lifetime of whatever DOES own the object.
In your code using shared_ptr
, using the following would have been 100% ok, and much better in my opinion:
const Node* n = root.get();
Mark non-mutating functions as const.
Your search()
member function should not alter the trie in any way, so you should mark it as const
like this:
bool search(const std::string & key) const {
There's a few subtle advantages for the compiler, but the main one is that if you goof up and accidentally do something that changes the trie, the compiler will tell you.
nitpick: private:
is redundant here.
The namespace of a class is private by default. Fun fact: that's the only difference between a class
and a struct
.
nitpick: redundant null-checking
in your search function, when I read the last line:
return n && n->end;
My impression was "Oh! n
can be possibly null
in certain cases", which led me to search for the scenario where that might be the case. It's misleading to the reader.
Defensive programming can be useful at times, but this is just excessive.
Input sanitization
Your functions take a std::string
as parameter, as such, they should be "well behaved" for any possible std::string
passed to them. Note that well behaved does not mean that it has to "work", just that it should not break anything.
Specifically, what would happen if I passed the string "Hello"
to that function? 'H' - 'a'
is -25
, ruh roh!
There's a few different ways to tackle this.
You could convert all upper-case letters to lower case, but that does not fix punctuation marks, spaces and numbers. I also personally think that wether the trie is case-sensitive or not should be the user's problem, not the trie itself.
You could bail out of the function if a unhandled character is hit.
Just expand
children
to 256 instead of 26 so that all legal values ofchar
are handled properly. Sure your trie is going to be 5 times as big, but that's a fairly minor detail since it grows logarithmically.
edit: This last approach also makes the trie operate on raw data instead of characters, which makes it encoding-agnostic (which gives you your unicode support)
Avoid using shared_ptr unless absolutely necessary
shared pointers are clunky and heavy affairs. On top of this, using them sends a message to people reading the code: Ownership of the object is potentially shared between multiple owners, which is simply not the case here.
in your case, a std::unique_ptr<>
is absolutely fine.
Your root does not need to be dynamically allocated
It's created at construction unconditionally, and destroyed at destruction unconditionally. On top of that, it does not make use of type erasure (polymorphism). As such, there's no reason for it not to be a normal member of trie.
std::shared_ptr<Node> root = std::make_shared<Node>();
becomes:
Node root;
This will require you to change your search and insert functions so that the first line becomes:
const Node* n = &root;
But that's fine because that's preferable once you move to unique_ptr
anyways.
Edit: On that note:
Raw pointers are not evil
std::shared_ptr<Node> n = root;
We tend to teach people that "you should never use raw pointers". But I find that extremely naive. The real rule is "You should never have raw pointers with ownership".
There is absolutely nothing wrong with using raw pointers as long as it's understood that:
- The pointer only refers to the pointed object without owning it.
- The lifetime of the pointer assignment to the object is fully enclosed by the lifetime of whatever DOES own the object.
In your code using shared_ptr
, using the following would have been 100% ok, and much better in my opinion:
const Node* n = root.get();
Mark non-mutating functions as const.
Your search()
member function should not alter the trie in any way, so you should mark it as const
like this:
bool search(const std::string & key) const {
There's a few subtle advantages for the compiler, but the main one is that if you goof up and accidentally do something that changes the trie, the compiler will tell you.
nitpick: private:
is redundant here.
The namespace of a class is private by default. Fun fact: that's the only difference between a class
and a struct
.
nitpick: redundant null-checking
in your search function, when I read the last line:
return n && n->end;
My impression was "Oh! n
can be possibly null
in certain cases", which led me to search for the scenario where that might be the case. It's misleading to the reader.
Defensive programming can be useful at times, but this is just excessive.
edited Apr 6 at 16:36
answered Apr 6 at 14:54
Frank
2,947319
2,947319
Great answer. My only complaint: you are too timid. YouâÂÂre saying that âÂÂin [your] opinionâ it would be better to use a raw pointer (rather than a smart pointer) in situation you point out. But this isnâÂÂt a matter of opinion. Using astd::shared_ptr
here is wrong (because it implies incorrect ownership semantics). At a stretch you could use astd::weak_ptr
here but IâÂÂd argue that even that is wrong since that, too, implies a semantic that doesnâÂÂt hold: namely that the owner might relinquish the resource while weâÂÂre accessing it. No. This is a raw pointer.
â Konrad Rudolph
Apr 6 at 16:38
@KonradRudolph A strong wrong would imply that the code is not well formed and fundamentally incorrect (like the out of bounds error I pointed out). This is not the case here. It's bad code, and should be changed, but it's not wrong code, it works. But I tend to agree, it's pretty black-and-white here.
â Frank
Apr 6 at 16:43
@Frank Ill-formedness and runtime errors are merely specific subsets of âÂÂwrongâÂÂ. For instance, a program can be well-formed, have no runtime errors, but give the wrong result. Or it can give the right result purely by accident. This is an instance of the latter: using a shared pointer happens to work here but it violates the use-case that it was designed for (due to the implicit specification of the algorithm, which has no ownership here). This use-case canâÂÂt be enforced programmatically but that doesnâÂÂt make the usage less wrong, merely unenforceable.
â Konrad Rudolph
Apr 6 at 16:48
@KonradRudolph But temporarily extending the lifetime of a shared object during transient processing is a legit (and common if you ever used ASIO) use-case. Using that technique here falls under the overkill, excessively defensive category of errors, similar to that null check.
â Frank
Apr 6 at 16:54
@Frank ThereâÂÂs no lifetime extension happening here. This is a pleonasm. Pleonasms, unless they serve a purpose in documenting expected behaviour (= defensive coding), are logic errors that simply happen to have no effect on the correctness of the result.
â Konrad Rudolph
Apr 6 at 17:21
add a comment |Â
Great answer. My only complaint: you are too timid. YouâÂÂre saying that âÂÂin [your] opinionâ it would be better to use a raw pointer (rather than a smart pointer) in situation you point out. But this isnâÂÂt a matter of opinion. Using astd::shared_ptr
here is wrong (because it implies incorrect ownership semantics). At a stretch you could use astd::weak_ptr
here but IâÂÂd argue that even that is wrong since that, too, implies a semantic that doesnâÂÂt hold: namely that the owner might relinquish the resource while weâÂÂre accessing it. No. This is a raw pointer.
â Konrad Rudolph
Apr 6 at 16:38
@KonradRudolph A strong wrong would imply that the code is not well formed and fundamentally incorrect (like the out of bounds error I pointed out). This is not the case here. It's bad code, and should be changed, but it's not wrong code, it works. But I tend to agree, it's pretty black-and-white here.
â Frank
Apr 6 at 16:43
@Frank Ill-formedness and runtime errors are merely specific subsets of âÂÂwrongâÂÂ. For instance, a program can be well-formed, have no runtime errors, but give the wrong result. Or it can give the right result purely by accident. This is an instance of the latter: using a shared pointer happens to work here but it violates the use-case that it was designed for (due to the implicit specification of the algorithm, which has no ownership here). This use-case canâÂÂt be enforced programmatically but that doesnâÂÂt make the usage less wrong, merely unenforceable.
â Konrad Rudolph
Apr 6 at 16:48
@KonradRudolph But temporarily extending the lifetime of a shared object during transient processing is a legit (and common if you ever used ASIO) use-case. Using that technique here falls under the overkill, excessively defensive category of errors, similar to that null check.
â Frank
Apr 6 at 16:54
@Frank ThereâÂÂs no lifetime extension happening here. This is a pleonasm. Pleonasms, unless they serve a purpose in documenting expected behaviour (= defensive coding), are logic errors that simply happen to have no effect on the correctness of the result.
â Konrad Rudolph
Apr 6 at 17:21
Great answer. My only complaint: you are too timid. YouâÂÂre saying that âÂÂin [your] opinionâ it would be better to use a raw pointer (rather than a smart pointer) in situation you point out. But this isnâÂÂt a matter of opinion. Using a
std::shared_ptr
here is wrong (because it implies incorrect ownership semantics). At a stretch you could use a std::weak_ptr
here but IâÂÂd argue that even that is wrong since that, too, implies a semantic that doesnâÂÂt hold: namely that the owner might relinquish the resource while weâÂÂre accessing it. No. This is a raw pointer.â Konrad Rudolph
Apr 6 at 16:38
Great answer. My only complaint: you are too timid. YouâÂÂre saying that âÂÂin [your] opinionâ it would be better to use a raw pointer (rather than a smart pointer) in situation you point out. But this isnâÂÂt a matter of opinion. Using a
std::shared_ptr
here is wrong (because it implies incorrect ownership semantics). At a stretch you could use a std::weak_ptr
here but IâÂÂd argue that even that is wrong since that, too, implies a semantic that doesnâÂÂt hold: namely that the owner might relinquish the resource while weâÂÂre accessing it. No. This is a raw pointer.â Konrad Rudolph
Apr 6 at 16:38
@KonradRudolph A strong wrong would imply that the code is not well formed and fundamentally incorrect (like the out of bounds error I pointed out). This is not the case here. It's bad code, and should be changed, but it's not wrong code, it works. But I tend to agree, it's pretty black-and-white here.
â Frank
Apr 6 at 16:43
@KonradRudolph A strong wrong would imply that the code is not well formed and fundamentally incorrect (like the out of bounds error I pointed out). This is not the case here. It's bad code, and should be changed, but it's not wrong code, it works. But I tend to agree, it's pretty black-and-white here.
â Frank
Apr 6 at 16:43
@Frank Ill-formedness and runtime errors are merely specific subsets of âÂÂwrongâÂÂ. For instance, a program can be well-formed, have no runtime errors, but give the wrong result. Or it can give the right result purely by accident. This is an instance of the latter: using a shared pointer happens to work here but it violates the use-case that it was designed for (due to the implicit specification of the algorithm, which has no ownership here). This use-case canâÂÂt be enforced programmatically but that doesnâÂÂt make the usage less wrong, merely unenforceable.
â Konrad Rudolph
Apr 6 at 16:48
@Frank Ill-formedness and runtime errors are merely specific subsets of âÂÂwrongâÂÂ. For instance, a program can be well-formed, have no runtime errors, but give the wrong result. Or it can give the right result purely by accident. This is an instance of the latter: using a shared pointer happens to work here but it violates the use-case that it was designed for (due to the implicit specification of the algorithm, which has no ownership here). This use-case canâÂÂt be enforced programmatically but that doesnâÂÂt make the usage less wrong, merely unenforceable.
â Konrad Rudolph
Apr 6 at 16:48
@KonradRudolph But temporarily extending the lifetime of a shared object during transient processing is a legit (and common if you ever used ASIO) use-case. Using that technique here falls under the overkill, excessively defensive category of errors, similar to that null check.
â Frank
Apr 6 at 16:54
@KonradRudolph But temporarily extending the lifetime of a shared object during transient processing is a legit (and common if you ever used ASIO) use-case. Using that technique here falls under the overkill, excessively defensive category of errors, similar to that null check.
â Frank
Apr 6 at 16:54
@Frank ThereâÂÂs no lifetime extension happening here. This is a pleonasm. Pleonasms, unless they serve a purpose in documenting expected behaviour (= defensive coding), are logic errors that simply happen to have no effect on the correctness of the result.
â Konrad Rudolph
Apr 6 at 17:21
@Frank ThereâÂÂs no lifetime extension happening here. This is a pleonasm. Pleonasms, unless they serve a purpose in documenting expected behaviour (= defensive coding), are logic errors that simply happen to have no effect on the correctness of the result.
â Konrad Rudolph
Apr 6 at 17:21
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f191413%2ftrie-data-structure-implementation-in-c11-using-smart-pointers-follow-up%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Just added an answer with regard to
shared_ptr
to the previous questionsâ Harald Scheirich
Apr 6 at 14:30