Generic Merge Sort in C++
Clash Royale CLAN TAG#URR8PPP
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;
up vote
2
down vote
favorite
I wanted to implement merge sort in a generic way. I was not sure whether I should use the iterator or the container for the template type. I did not find a way to infer the container type using iterator_traits, so I decided to use the templated type for the container. Please let me know what you think!
#include <iostream>
#include <vector>
using namespace std;
template<typename T>
void printArray(const T& a)
for (auto i: a)
cout << i << " ";
cout << endl;
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
auto n = end - begin;
if (n < 2)
return T(begin, end);
auto firstHalf = mergeSort<T>(begin, begin + n/2);
auto secondHalf = mergeSort<T>(begin + n/2, end);
auto firstIt = firstHalf.begin();
auto secondIt = secondHalf.begin();
T merged;
while (firstIt != firstHalf.end() && secondIt != secondHalf.end())
if (*firstIt < *secondIt)
merged.push_back(*firstIt);
++firstIt;
else
merged.push_back(*secondIt);
++secondIt;
while(firstIt != firstHalf.end())
merged.push_back(*firstIt);
++firstIt;
while(secondIt != secondHalf.end())
merged.push_back(*secondIt);
++secondIt;
return merged;
int main()
vector<int> a = 3, 0, 7, 5, 7, 8, 3, 1;
printArray(mergeSort<vector<int>>(a.begin(), a.end()));
return 0;
c++ sorting mergesort
add a comment |Â
up vote
2
down vote
favorite
I wanted to implement merge sort in a generic way. I was not sure whether I should use the iterator or the container for the template type. I did not find a way to infer the container type using iterator_traits, so I decided to use the templated type for the container. Please let me know what you think!
#include <iostream>
#include <vector>
using namespace std;
template<typename T>
void printArray(const T& a)
for (auto i: a)
cout << i << " ";
cout << endl;
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
auto n = end - begin;
if (n < 2)
return T(begin, end);
auto firstHalf = mergeSort<T>(begin, begin + n/2);
auto secondHalf = mergeSort<T>(begin + n/2, end);
auto firstIt = firstHalf.begin();
auto secondIt = secondHalf.begin();
T merged;
while (firstIt != firstHalf.end() && secondIt != secondHalf.end())
if (*firstIt < *secondIt)
merged.push_back(*firstIt);
++firstIt;
else
merged.push_back(*secondIt);
++secondIt;
while(firstIt != firstHalf.end())
merged.push_back(*firstIt);
++firstIt;
while(secondIt != secondHalf.end())
merged.push_back(*secondIt);
++secondIt;
return merged;
int main()
vector<int> a = 3, 0, 7, 5, 7, 8, 3, 1;
printArray(mergeSort<vector<int>>(a.begin(), a.end()));
return 0;
c++ sorting mergesort
add a comment |Â
up vote
2
down vote
favorite
up vote
2
down vote
favorite
I wanted to implement merge sort in a generic way. I was not sure whether I should use the iterator or the container for the template type. I did not find a way to infer the container type using iterator_traits, so I decided to use the templated type for the container. Please let me know what you think!
#include <iostream>
#include <vector>
using namespace std;
template<typename T>
void printArray(const T& a)
for (auto i: a)
cout << i << " ";
cout << endl;
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
auto n = end - begin;
if (n < 2)
return T(begin, end);
auto firstHalf = mergeSort<T>(begin, begin + n/2);
auto secondHalf = mergeSort<T>(begin + n/2, end);
auto firstIt = firstHalf.begin();
auto secondIt = secondHalf.begin();
T merged;
while (firstIt != firstHalf.end() && secondIt != secondHalf.end())
if (*firstIt < *secondIt)
merged.push_back(*firstIt);
++firstIt;
else
merged.push_back(*secondIt);
++secondIt;
while(firstIt != firstHalf.end())
merged.push_back(*firstIt);
++firstIt;
while(secondIt != secondHalf.end())
merged.push_back(*secondIt);
++secondIt;
return merged;
int main()
vector<int> a = 3, 0, 7, 5, 7, 8, 3, 1;
printArray(mergeSort<vector<int>>(a.begin(), a.end()));
return 0;
c++ sorting mergesort
I wanted to implement merge sort in a generic way. I was not sure whether I should use the iterator or the container for the template type. I did not find a way to infer the container type using iterator_traits, so I decided to use the templated type for the container. Please let me know what you think!
#include <iostream>
#include <vector>
using namespace std;
template<typename T>
void printArray(const T& a)
for (auto i: a)
cout << i << " ";
cout << endl;
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
auto n = end - begin;
if (n < 2)
return T(begin, end);
auto firstHalf = mergeSort<T>(begin, begin + n/2);
auto secondHalf = mergeSort<T>(begin + n/2, end);
auto firstIt = firstHalf.begin();
auto secondIt = secondHalf.begin();
T merged;
while (firstIt != firstHalf.end() && secondIt != secondHalf.end())
if (*firstIt < *secondIt)
merged.push_back(*firstIt);
++firstIt;
else
merged.push_back(*secondIt);
++secondIt;
while(firstIt != firstHalf.end())
merged.push_back(*firstIt);
++firstIt;
while(secondIt != secondHalf.end())
merged.push_back(*secondIt);
++secondIt;
return merged;
int main()
vector<int> a = 3, 0, 7, 5, 7, 8, 3, 1;
printArray(mergeSort<vector<int>>(a.begin(), a.end()));
return 0;
c++ sorting mergesort
edited Jan 5 at 22:07
Sam Onela
5,88461545
5,88461545
asked Jan 5 at 21:49
Nils
2631210
2631210
add a comment |Â
add a comment |Â
2 Answers
2
active
oldest
votes
up vote
3
down vote
accepted
Design
int main()
vector<int> a = 3, 0, 7, 5, 7, 8, 3, 1;
printArray(mergeSort<vector<int>>(a.begin(), a.end()));
return 0;
My problem here is that you have to specify the type as part of the name of the function mergeSort<vector<int>>()
. It would be much nicer if the compiler could work out the type for you, this would make modifying the code much easier in the long run.
In the above code if I change the type of a
then I also need to change the type of mergeSort<vector<int>>()
. In certain situations this could cause issues when coupled with auto-conversion.
I also dislike that you create a new container as a result of the sort. Why not sort in-place. A sort in-place can be used in conjunction with a copy to create a newly sorted container (but not the other way around).
Also your way of doing the sort allocates a lot of small arrays dynamically. This is very expensive. It would be better if you could allocate a temporary buffer that can be re-used by the sorting code.
The last thing in the design is that you use a single function to do all the work. In self documenting code it is much nicer to break your code into short self named (and thus explained) operations.
Code Review
Namespace using
Please don't do this:
using namespace std;
Thx! I never use "using namespace std;" for real code, this was just an exercise for myself, that's also why I wanted to implement merge for myself.
Thats not a good excuse. The trouble with doing it for small programs is that it becomes a reflex and you then accidentally do it all the time. Best not to form bad habits in the first place.
Also this code (if it had reviewed well) could have just been copied into a larger project and then polluted it.
Using const reference consistency.
Can you spot the issue here?
template<typename T>
void printArray(const T& a)
for (auto i: a)
cout << i << " ";
cout << endl;
What about i
. Every iteration of the loop you are copying the value out of the container into the variable i
. This is fine for POD or simple types. But what about complex types (or expensive to copy types).
for (auto const& i: a)
// ^^^^^^
cout << i << " ";
Assumptions Kill Generic abilities.
This assumes that the iterators are Random Access Iterators
.
auto n = end - begin;
You could use std::distance()
this will allow you to find the count in the most efficient way no matter what the type of the iterator.
You only use Copy Semantics
Your merge moves the objects between containers using copy semantics.
merged.push_back(*firstIt);
Since C++11 we have had move semantics. This (move) is never more expensive than a copy and usually much more efficient than a copy. None of your intermediate arrays are re-used. So there is no need to preserve the data in these intermediate containers. So you should be using move semantics when moving the objects internally.
Use standard algorithms were they are available.
while(firstIt != firstHalf.end())
merged.push_back(*firstIt);
++firstIt;
This is obviously a copy.
std::copy(firstIt, firstHalf.end(), std::back_inserter(merged));
Example
#include <vector>
#include <iterator>
#include <algorithm>
#include <utility>
#include <iostream>
template<typename I, typename BI>
void mergeTogether(I begin, I mid, I end, BI beginBuffer)
I low = begin;
I hig = mid;
BI buf = beginBuffer;
while(low < mid && hig < end)
if (*low <= *hig)
std::iter_swap(low, buf);
++low;
else
std::iter_swap(hig, buf);
++hig;
++buf;
std::move(low, mid, buf);
std::move(hig, end, buf);
std::size_t size = std::distance(begin, end);
std::move(beginBuffer, std::next(beginBuffer, size), begin);
template<typename I, typename BI>
void mergeSortUsingBuffer(I begin, I end, BI beginBuffer)
std::size_t size = std::distance(begin, end);
if (size < 2)
return;
std::size_t split = size/2;
I mid = std::next(begin, split);
BI midBuffer = std::next(beginBuffer, split);
mergeSortUsingBuffer(begin, mid, beginBuffer);
mergeSortUsingBuffer(mid, end, midBuffer);
mergeTogether(begin, mid, end, beginBuffer);
template<typename I>
void mergeSort(I begin, I end)
using ValueType = typename std::iterator_traits<I>::value_type;
std::size_t size = std::distance(begin, end);
std::vector<ValueType> buffer(size);
mergeSortUsingBuffer(begin, end, std::begin(buffer));
int main()
std::vector<int> a = 3, 0, 7, 5, 7, 8, 3, 1;
mergeSort(std::begin(a), std::end(a));
std::copy(std::begin(a), std::end(a),
std::ostream_iterator<int>(std::cout, ", "));
std::cout << "n";
1
Good advice, just a line explaining how the move semantics should be done here would help this answer.
â Felix Dombek
Jan 6 at 23:56
Thx for your review! > It would be much nicer if the compiler could work out the type for you, this would make modifying the code much easier in the long run. Sure, but how would you do it? If using the template type for iterators I couldn't figure out how to get the container type later in the function.
â Nils
Jan 7 at 20:30
1
@Nils You would template on the type of the parameters to the function not the parent type. So make the template parameter about iterators not about containers.
â Martin York
Jan 8 at 19:01
1
@Nils Added an example of how I would do based on all my comments.
â Martin York
Jan 8 at 20:02
Ah ok in this case the buffer will be a vector, but the input could use a different container. Thx for sharing your implementation!
â Nils
Jan 9 at 14:02
 |Â
show 3 more comments
up vote
3
down vote
While your merge step in mergeSort
works it should be put into its own function:
// Exercise: make `my_merge` even more generic
template <typename Iterator, typename OutputIterator>
void my_merge(Iterator first1, Iterator last1,
Iterator first2, Iterator last2,
OutputIterator out)
while (first1 != last1 && first2 != last2)
if (*first1 < *first2)
*out++ = *first1++;
else
*out++ = *first2++;
while(first1 != last1)
*out++ = *first1++;
while(first2 != last2)
*out++ = *first2++;
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
auto n = end - begin;
if (n < 2)
return T(begin, end);
auto firstHalf = mergeSort<T>(begin, begin + n/2);
auto secondHalf = mergeSort<T>(begin + n/2, end);
T merged;
my_merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
Note that I had to call my_merge
that way since std
already contains std::merge
, which we can use instead:
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
auto n = end - begin;
if (n < 2)
return T(begin, end);
auto firstHalf = mergeSort<T>(begin, begin + n/2);
auto secondHalf = mergeSort<T>(begin + n/2, end);
T merged;
std::merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
The merge
and the parameters begin
and end
are good examples why using namespace std
has to be handled with care. All those names are already in std
. Use using namespace std
only if you know the risks.
That being said, your generic variant will only work with random access iterators. If you want to make it more generic, you have to use std::distance
and std::next
instead of end - begin
and begin + n/2
:
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
const auto n = std::distance(begin, end);
if (n < 2)
return T(begin, end);
const auto midpoint = std::next(begin, n / 2);
const auto firstHalf = mergeSort<T>(begin, midpoint);
const auto secondHalf = mergeSort<T>(midpoint, end);
T merged;
std::merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
Even if you don't want to use std::next
re-use the midpoint
.
Thx! I never use "using namespace std;" for real code, this was just an exercise for myself, that's also why I wanted to implement merge for myself.
â Nils
Jan 5 at 22:37
1
@Nils I was almost asleep when I wrote that review. I forgot to say that you can provide your ownmerge
version with a similar interface and thereby splitmergeSort
into two functions: one that splits, callsmerge
and calls itself recursively, and one that just merges.
â Zeta
Jan 6 at 11:10
I agree with the split, makes it easier to read.
â Nils
Jan 7 at 20:25
add a comment |Â
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
3
down vote
accepted
Design
int main()
vector<int> a = 3, 0, 7, 5, 7, 8, 3, 1;
printArray(mergeSort<vector<int>>(a.begin(), a.end()));
return 0;
My problem here is that you have to specify the type as part of the name of the function mergeSort<vector<int>>()
. It would be much nicer if the compiler could work out the type for you, this would make modifying the code much easier in the long run.
In the above code if I change the type of a
then I also need to change the type of mergeSort<vector<int>>()
. In certain situations this could cause issues when coupled with auto-conversion.
I also dislike that you create a new container as a result of the sort. Why not sort in-place. A sort in-place can be used in conjunction with a copy to create a newly sorted container (but not the other way around).
Also your way of doing the sort allocates a lot of small arrays dynamically. This is very expensive. It would be better if you could allocate a temporary buffer that can be re-used by the sorting code.
The last thing in the design is that you use a single function to do all the work. In self documenting code it is much nicer to break your code into short self named (and thus explained) operations.
Code Review
Namespace using
Please don't do this:
using namespace std;
Thx! I never use "using namespace std;" for real code, this was just an exercise for myself, that's also why I wanted to implement merge for myself.
Thats not a good excuse. The trouble with doing it for small programs is that it becomes a reflex and you then accidentally do it all the time. Best not to form bad habits in the first place.
Also this code (if it had reviewed well) could have just been copied into a larger project and then polluted it.
Using const reference consistency.
Can you spot the issue here?
template<typename T>
void printArray(const T& a)
for (auto i: a)
cout << i << " ";
cout << endl;
What about i
. Every iteration of the loop you are copying the value out of the container into the variable i
. This is fine for POD or simple types. But what about complex types (or expensive to copy types).
for (auto const& i: a)
// ^^^^^^
cout << i << " ";
Assumptions Kill Generic abilities.
This assumes that the iterators are Random Access Iterators
.
auto n = end - begin;
You could use std::distance()
this will allow you to find the count in the most efficient way no matter what the type of the iterator.
You only use Copy Semantics
Your merge moves the objects between containers using copy semantics.
merged.push_back(*firstIt);
Since C++11 we have had move semantics. This (move) is never more expensive than a copy and usually much more efficient than a copy. None of your intermediate arrays are re-used. So there is no need to preserve the data in these intermediate containers. So you should be using move semantics when moving the objects internally.
Use standard algorithms were they are available.
while(firstIt != firstHalf.end())
merged.push_back(*firstIt);
++firstIt;
This is obviously a copy.
std::copy(firstIt, firstHalf.end(), std::back_inserter(merged));
Example
#include <vector>
#include <iterator>
#include <algorithm>
#include <utility>
#include <iostream>
template<typename I, typename BI>
void mergeTogether(I begin, I mid, I end, BI beginBuffer)
I low = begin;
I hig = mid;
BI buf = beginBuffer;
while(low < mid && hig < end)
if (*low <= *hig)
std::iter_swap(low, buf);
++low;
else
std::iter_swap(hig, buf);
++hig;
++buf;
std::move(low, mid, buf);
std::move(hig, end, buf);
std::size_t size = std::distance(begin, end);
std::move(beginBuffer, std::next(beginBuffer, size), begin);
template<typename I, typename BI>
void mergeSortUsingBuffer(I begin, I end, BI beginBuffer)
std::size_t size = std::distance(begin, end);
if (size < 2)
return;
std::size_t split = size/2;
I mid = std::next(begin, split);
BI midBuffer = std::next(beginBuffer, split);
mergeSortUsingBuffer(begin, mid, beginBuffer);
mergeSortUsingBuffer(mid, end, midBuffer);
mergeTogether(begin, mid, end, beginBuffer);
template<typename I>
void mergeSort(I begin, I end)
using ValueType = typename std::iterator_traits<I>::value_type;
std::size_t size = std::distance(begin, end);
std::vector<ValueType> buffer(size);
mergeSortUsingBuffer(begin, end, std::begin(buffer));
int main()
std::vector<int> a = 3, 0, 7, 5, 7, 8, 3, 1;
mergeSort(std::begin(a), std::end(a));
std::copy(std::begin(a), std::end(a),
std::ostream_iterator<int>(std::cout, ", "));
std::cout << "n";
1
Good advice, just a line explaining how the move semantics should be done here would help this answer.
â Felix Dombek
Jan 6 at 23:56
Thx for your review! > It would be much nicer if the compiler could work out the type for you, this would make modifying the code much easier in the long run. Sure, but how would you do it? If using the template type for iterators I couldn't figure out how to get the container type later in the function.
â Nils
Jan 7 at 20:30
1
@Nils You would template on the type of the parameters to the function not the parent type. So make the template parameter about iterators not about containers.
â Martin York
Jan 8 at 19:01
1
@Nils Added an example of how I would do based on all my comments.
â Martin York
Jan 8 at 20:02
Ah ok in this case the buffer will be a vector, but the input could use a different container. Thx for sharing your implementation!
â Nils
Jan 9 at 14:02
 |Â
show 3 more comments
up vote
3
down vote
accepted
Design
int main()
vector<int> a = 3, 0, 7, 5, 7, 8, 3, 1;
printArray(mergeSort<vector<int>>(a.begin(), a.end()));
return 0;
My problem here is that you have to specify the type as part of the name of the function mergeSort<vector<int>>()
. It would be much nicer if the compiler could work out the type for you, this would make modifying the code much easier in the long run.
In the above code if I change the type of a
then I also need to change the type of mergeSort<vector<int>>()
. In certain situations this could cause issues when coupled with auto-conversion.
I also dislike that you create a new container as a result of the sort. Why not sort in-place. A sort in-place can be used in conjunction with a copy to create a newly sorted container (but not the other way around).
Also your way of doing the sort allocates a lot of small arrays dynamically. This is very expensive. It would be better if you could allocate a temporary buffer that can be re-used by the sorting code.
The last thing in the design is that you use a single function to do all the work. In self documenting code it is much nicer to break your code into short self named (and thus explained) operations.
Code Review
Namespace using
Please don't do this:
using namespace std;
Thx! I never use "using namespace std;" for real code, this was just an exercise for myself, that's also why I wanted to implement merge for myself.
Thats not a good excuse. The trouble with doing it for small programs is that it becomes a reflex and you then accidentally do it all the time. Best not to form bad habits in the first place.
Also this code (if it had reviewed well) could have just been copied into a larger project and then polluted it.
Using const reference consistency.
Can you spot the issue here?
template<typename T>
void printArray(const T& a)
for (auto i: a)
cout << i << " ";
cout << endl;
What about i
. Every iteration of the loop you are copying the value out of the container into the variable i
. This is fine for POD or simple types. But what about complex types (or expensive to copy types).
for (auto const& i: a)
// ^^^^^^
cout << i << " ";
Assumptions Kill Generic abilities.
This assumes that the iterators are Random Access Iterators
.
auto n = end - begin;
You could use std::distance()
this will allow you to find the count in the most efficient way no matter what the type of the iterator.
You only use Copy Semantics
Your merge moves the objects between containers using copy semantics.
merged.push_back(*firstIt);
Since C++11 we have had move semantics. This (move) is never more expensive than a copy and usually much more efficient than a copy. None of your intermediate arrays are re-used. So there is no need to preserve the data in these intermediate containers. So you should be using move semantics when moving the objects internally.
Use standard algorithms were they are available.
while(firstIt != firstHalf.end())
merged.push_back(*firstIt);
++firstIt;
This is obviously a copy.
std::copy(firstIt, firstHalf.end(), std::back_inserter(merged));
Example
#include <vector>
#include <iterator>
#include <algorithm>
#include <utility>
#include <iostream>
template<typename I, typename BI>
void mergeTogether(I begin, I mid, I end, BI beginBuffer)
I low = begin;
I hig = mid;
BI buf = beginBuffer;
while(low < mid && hig < end)
if (*low <= *hig)
std::iter_swap(low, buf);
++low;
else
std::iter_swap(hig, buf);
++hig;
++buf;
std::move(low, mid, buf);
std::move(hig, end, buf);
std::size_t size = std::distance(begin, end);
std::move(beginBuffer, std::next(beginBuffer, size), begin);
template<typename I, typename BI>
void mergeSortUsingBuffer(I begin, I end, BI beginBuffer)
std::size_t size = std::distance(begin, end);
if (size < 2)
return;
std::size_t split = size/2;
I mid = std::next(begin, split);
BI midBuffer = std::next(beginBuffer, split);
mergeSortUsingBuffer(begin, mid, beginBuffer);
mergeSortUsingBuffer(mid, end, midBuffer);
mergeTogether(begin, mid, end, beginBuffer);
template<typename I>
void mergeSort(I begin, I end)
using ValueType = typename std::iterator_traits<I>::value_type;
std::size_t size = std::distance(begin, end);
std::vector<ValueType> buffer(size);
mergeSortUsingBuffer(begin, end, std::begin(buffer));
int main()
std::vector<int> a = 3, 0, 7, 5, 7, 8, 3, 1;
mergeSort(std::begin(a), std::end(a));
std::copy(std::begin(a), std::end(a),
std::ostream_iterator<int>(std::cout, ", "));
std::cout << "n";
1
Good advice, just a line explaining how the move semantics should be done here would help this answer.
â Felix Dombek
Jan 6 at 23:56
Thx for your review! > It would be much nicer if the compiler could work out the type for you, this would make modifying the code much easier in the long run. Sure, but how would you do it? If using the template type for iterators I couldn't figure out how to get the container type later in the function.
â Nils
Jan 7 at 20:30
1
@Nils You would template on the type of the parameters to the function not the parent type. So make the template parameter about iterators not about containers.
â Martin York
Jan 8 at 19:01
1
@Nils Added an example of how I would do based on all my comments.
â Martin York
Jan 8 at 20:02
Ah ok in this case the buffer will be a vector, but the input could use a different container. Thx for sharing your implementation!
â Nils
Jan 9 at 14:02
 |Â
show 3 more comments
up vote
3
down vote
accepted
up vote
3
down vote
accepted
Design
int main()
vector<int> a = 3, 0, 7, 5, 7, 8, 3, 1;
printArray(mergeSort<vector<int>>(a.begin(), a.end()));
return 0;
My problem here is that you have to specify the type as part of the name of the function mergeSort<vector<int>>()
. It would be much nicer if the compiler could work out the type for you, this would make modifying the code much easier in the long run.
In the above code if I change the type of a
then I also need to change the type of mergeSort<vector<int>>()
. In certain situations this could cause issues when coupled with auto-conversion.
I also dislike that you create a new container as a result of the sort. Why not sort in-place. A sort in-place can be used in conjunction with a copy to create a newly sorted container (but not the other way around).
Also your way of doing the sort allocates a lot of small arrays dynamically. This is very expensive. It would be better if you could allocate a temporary buffer that can be re-used by the sorting code.
The last thing in the design is that you use a single function to do all the work. In self documenting code it is much nicer to break your code into short self named (and thus explained) operations.
Code Review
Namespace using
Please don't do this:
using namespace std;
Thx! I never use "using namespace std;" for real code, this was just an exercise for myself, that's also why I wanted to implement merge for myself.
Thats not a good excuse. The trouble with doing it for small programs is that it becomes a reflex and you then accidentally do it all the time. Best not to form bad habits in the first place.
Also this code (if it had reviewed well) could have just been copied into a larger project and then polluted it.
Using const reference consistency.
Can you spot the issue here?
template<typename T>
void printArray(const T& a)
for (auto i: a)
cout << i << " ";
cout << endl;
What about i
. Every iteration of the loop you are copying the value out of the container into the variable i
. This is fine for POD or simple types. But what about complex types (or expensive to copy types).
for (auto const& i: a)
// ^^^^^^
cout << i << " ";
Assumptions Kill Generic abilities.
This assumes that the iterators are Random Access Iterators
.
auto n = end - begin;
You could use std::distance()
this will allow you to find the count in the most efficient way no matter what the type of the iterator.
You only use Copy Semantics
Your merge moves the objects between containers using copy semantics.
merged.push_back(*firstIt);
Since C++11 we have had move semantics. This (move) is never more expensive than a copy and usually much more efficient than a copy. None of your intermediate arrays are re-used. So there is no need to preserve the data in these intermediate containers. So you should be using move semantics when moving the objects internally.
Use standard algorithms were they are available.
while(firstIt != firstHalf.end())
merged.push_back(*firstIt);
++firstIt;
This is obviously a copy.
std::copy(firstIt, firstHalf.end(), std::back_inserter(merged));
Example
#include <vector>
#include <iterator>
#include <algorithm>
#include <utility>
#include <iostream>
template<typename I, typename BI>
void mergeTogether(I begin, I mid, I end, BI beginBuffer)
I low = begin;
I hig = mid;
BI buf = beginBuffer;
while(low < mid && hig < end)
if (*low <= *hig)
std::iter_swap(low, buf);
++low;
else
std::iter_swap(hig, buf);
++hig;
++buf;
std::move(low, mid, buf);
std::move(hig, end, buf);
std::size_t size = std::distance(begin, end);
std::move(beginBuffer, std::next(beginBuffer, size), begin);
template<typename I, typename BI>
void mergeSortUsingBuffer(I begin, I end, BI beginBuffer)
std::size_t size = std::distance(begin, end);
if (size < 2)
return;
std::size_t split = size/2;
I mid = std::next(begin, split);
BI midBuffer = std::next(beginBuffer, split);
mergeSortUsingBuffer(begin, mid, beginBuffer);
mergeSortUsingBuffer(mid, end, midBuffer);
mergeTogether(begin, mid, end, beginBuffer);
template<typename I>
void mergeSort(I begin, I end)
using ValueType = typename std::iterator_traits<I>::value_type;
std::size_t size = std::distance(begin, end);
std::vector<ValueType> buffer(size);
mergeSortUsingBuffer(begin, end, std::begin(buffer));
int main()
std::vector<int> a = 3, 0, 7, 5, 7, 8, 3, 1;
mergeSort(std::begin(a), std::end(a));
std::copy(std::begin(a), std::end(a),
std::ostream_iterator<int>(std::cout, ", "));
std::cout << "n";
Design
int main()
vector<int> a = 3, 0, 7, 5, 7, 8, 3, 1;
printArray(mergeSort<vector<int>>(a.begin(), a.end()));
return 0;
My problem here is that you have to specify the type as part of the name of the function mergeSort<vector<int>>()
. It would be much nicer if the compiler could work out the type for you, this would make modifying the code much easier in the long run.
In the above code if I change the type of a
then I also need to change the type of mergeSort<vector<int>>()
. In certain situations this could cause issues when coupled with auto-conversion.
I also dislike that you create a new container as a result of the sort. Why not sort in-place. A sort in-place can be used in conjunction with a copy to create a newly sorted container (but not the other way around).
Also your way of doing the sort allocates a lot of small arrays dynamically. This is very expensive. It would be better if you could allocate a temporary buffer that can be re-used by the sorting code.
The last thing in the design is that you use a single function to do all the work. In self documenting code it is much nicer to break your code into short self named (and thus explained) operations.
Code Review
Namespace using
Please don't do this:
using namespace std;
Thx! I never use "using namespace std;" for real code, this was just an exercise for myself, that's also why I wanted to implement merge for myself.
Thats not a good excuse. The trouble with doing it for small programs is that it becomes a reflex and you then accidentally do it all the time. Best not to form bad habits in the first place.
Also this code (if it had reviewed well) could have just been copied into a larger project and then polluted it.
Using const reference consistency.
Can you spot the issue here?
template<typename T>
void printArray(const T& a)
for (auto i: a)
cout << i << " ";
cout << endl;
What about i
. Every iteration of the loop you are copying the value out of the container into the variable i
. This is fine for POD or simple types. But what about complex types (or expensive to copy types).
for (auto const& i: a)
// ^^^^^^
cout << i << " ";
Assumptions Kill Generic abilities.
This assumes that the iterators are Random Access Iterators
.
auto n = end - begin;
You could use std::distance()
this will allow you to find the count in the most efficient way no matter what the type of the iterator.
You only use Copy Semantics
Your merge moves the objects between containers using copy semantics.
merged.push_back(*firstIt);
Since C++11 we have had move semantics. This (move) is never more expensive than a copy and usually much more efficient than a copy. None of your intermediate arrays are re-used. So there is no need to preserve the data in these intermediate containers. So you should be using move semantics when moving the objects internally.
Use standard algorithms were they are available.
while(firstIt != firstHalf.end())
merged.push_back(*firstIt);
++firstIt;
This is obviously a copy.
std::copy(firstIt, firstHalf.end(), std::back_inserter(merged));
Example
#include <vector>
#include <iterator>
#include <algorithm>
#include <utility>
#include <iostream>
template<typename I, typename BI>
void mergeTogether(I begin, I mid, I end, BI beginBuffer)
I low = begin;
I hig = mid;
BI buf = beginBuffer;
while(low < mid && hig < end)
if (*low <= *hig)
std::iter_swap(low, buf);
++low;
else
std::iter_swap(hig, buf);
++hig;
++buf;
std::move(low, mid, buf);
std::move(hig, end, buf);
std::size_t size = std::distance(begin, end);
std::move(beginBuffer, std::next(beginBuffer, size), begin);
template<typename I, typename BI>
void mergeSortUsingBuffer(I begin, I end, BI beginBuffer)
std::size_t size = std::distance(begin, end);
if (size < 2)
return;
std::size_t split = size/2;
I mid = std::next(begin, split);
BI midBuffer = std::next(beginBuffer, split);
mergeSortUsingBuffer(begin, mid, beginBuffer);
mergeSortUsingBuffer(mid, end, midBuffer);
mergeTogether(begin, mid, end, beginBuffer);
template<typename I>
void mergeSort(I begin, I end)
using ValueType = typename std::iterator_traits<I>::value_type;
std::size_t size = std::distance(begin, end);
std::vector<ValueType> buffer(size);
mergeSortUsingBuffer(begin, end, std::begin(buffer));
int main()
std::vector<int> a = 3, 0, 7, 5, 7, 8, 3, 1;
mergeSort(std::begin(a), std::end(a));
std::copy(std::begin(a), std::end(a),
std::ostream_iterator<int>(std::cout, ", "));
std::cout << "n";
edited Jan 8 at 19:56
answered Jan 6 at 21:02
Martin York
70.9k481244
70.9k481244
1
Good advice, just a line explaining how the move semantics should be done here would help this answer.
â Felix Dombek
Jan 6 at 23:56
Thx for your review! > It would be much nicer if the compiler could work out the type for you, this would make modifying the code much easier in the long run. Sure, but how would you do it? If using the template type for iterators I couldn't figure out how to get the container type later in the function.
â Nils
Jan 7 at 20:30
1
@Nils You would template on the type of the parameters to the function not the parent type. So make the template parameter about iterators not about containers.
â Martin York
Jan 8 at 19:01
1
@Nils Added an example of how I would do based on all my comments.
â Martin York
Jan 8 at 20:02
Ah ok in this case the buffer will be a vector, but the input could use a different container. Thx for sharing your implementation!
â Nils
Jan 9 at 14:02
 |Â
show 3 more comments
1
Good advice, just a line explaining how the move semantics should be done here would help this answer.
â Felix Dombek
Jan 6 at 23:56
Thx for your review! > It would be much nicer if the compiler could work out the type for you, this would make modifying the code much easier in the long run. Sure, but how would you do it? If using the template type for iterators I couldn't figure out how to get the container type later in the function.
â Nils
Jan 7 at 20:30
1
@Nils You would template on the type of the parameters to the function not the parent type. So make the template parameter about iterators not about containers.
â Martin York
Jan 8 at 19:01
1
@Nils Added an example of how I would do based on all my comments.
â Martin York
Jan 8 at 20:02
Ah ok in this case the buffer will be a vector, but the input could use a different container. Thx for sharing your implementation!
â Nils
Jan 9 at 14:02
1
1
Good advice, just a line explaining how the move semantics should be done here would help this answer.
â Felix Dombek
Jan 6 at 23:56
Good advice, just a line explaining how the move semantics should be done here would help this answer.
â Felix Dombek
Jan 6 at 23:56
Thx for your review! > It would be much nicer if the compiler could work out the type for you, this would make modifying the code much easier in the long run. Sure, but how would you do it? If using the template type for iterators I couldn't figure out how to get the container type later in the function.
â Nils
Jan 7 at 20:30
Thx for your review! > It would be much nicer if the compiler could work out the type for you, this would make modifying the code much easier in the long run. Sure, but how would you do it? If using the template type for iterators I couldn't figure out how to get the container type later in the function.
â Nils
Jan 7 at 20:30
1
1
@Nils You would template on the type of the parameters to the function not the parent type. So make the template parameter about iterators not about containers.
â Martin York
Jan 8 at 19:01
@Nils You would template on the type of the parameters to the function not the parent type. So make the template parameter about iterators not about containers.
â Martin York
Jan 8 at 19:01
1
1
@Nils Added an example of how I would do based on all my comments.
â Martin York
Jan 8 at 20:02
@Nils Added an example of how I would do based on all my comments.
â Martin York
Jan 8 at 20:02
Ah ok in this case the buffer will be a vector, but the input could use a different container. Thx for sharing your implementation!
â Nils
Jan 9 at 14:02
Ah ok in this case the buffer will be a vector, but the input could use a different container. Thx for sharing your implementation!
â Nils
Jan 9 at 14:02
 |Â
show 3 more comments
up vote
3
down vote
While your merge step in mergeSort
works it should be put into its own function:
// Exercise: make `my_merge` even more generic
template <typename Iterator, typename OutputIterator>
void my_merge(Iterator first1, Iterator last1,
Iterator first2, Iterator last2,
OutputIterator out)
while (first1 != last1 && first2 != last2)
if (*first1 < *first2)
*out++ = *first1++;
else
*out++ = *first2++;
while(first1 != last1)
*out++ = *first1++;
while(first2 != last2)
*out++ = *first2++;
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
auto n = end - begin;
if (n < 2)
return T(begin, end);
auto firstHalf = mergeSort<T>(begin, begin + n/2);
auto secondHalf = mergeSort<T>(begin + n/2, end);
T merged;
my_merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
Note that I had to call my_merge
that way since std
already contains std::merge
, which we can use instead:
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
auto n = end - begin;
if (n < 2)
return T(begin, end);
auto firstHalf = mergeSort<T>(begin, begin + n/2);
auto secondHalf = mergeSort<T>(begin + n/2, end);
T merged;
std::merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
The merge
and the parameters begin
and end
are good examples why using namespace std
has to be handled with care. All those names are already in std
. Use using namespace std
only if you know the risks.
That being said, your generic variant will only work with random access iterators. If you want to make it more generic, you have to use std::distance
and std::next
instead of end - begin
and begin + n/2
:
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
const auto n = std::distance(begin, end);
if (n < 2)
return T(begin, end);
const auto midpoint = std::next(begin, n / 2);
const auto firstHalf = mergeSort<T>(begin, midpoint);
const auto secondHalf = mergeSort<T>(midpoint, end);
T merged;
std::merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
Even if you don't want to use std::next
re-use the midpoint
.
Thx! I never use "using namespace std;" for real code, this was just an exercise for myself, that's also why I wanted to implement merge for myself.
â Nils
Jan 5 at 22:37
1
@Nils I was almost asleep when I wrote that review. I forgot to say that you can provide your ownmerge
version with a similar interface and thereby splitmergeSort
into two functions: one that splits, callsmerge
and calls itself recursively, and one that just merges.
â Zeta
Jan 6 at 11:10
I agree with the split, makes it easier to read.
â Nils
Jan 7 at 20:25
add a comment |Â
up vote
3
down vote
While your merge step in mergeSort
works it should be put into its own function:
// Exercise: make `my_merge` even more generic
template <typename Iterator, typename OutputIterator>
void my_merge(Iterator first1, Iterator last1,
Iterator first2, Iterator last2,
OutputIterator out)
while (first1 != last1 && first2 != last2)
if (*first1 < *first2)
*out++ = *first1++;
else
*out++ = *first2++;
while(first1 != last1)
*out++ = *first1++;
while(first2 != last2)
*out++ = *first2++;
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
auto n = end - begin;
if (n < 2)
return T(begin, end);
auto firstHalf = mergeSort<T>(begin, begin + n/2);
auto secondHalf = mergeSort<T>(begin + n/2, end);
T merged;
my_merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
Note that I had to call my_merge
that way since std
already contains std::merge
, which we can use instead:
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
auto n = end - begin;
if (n < 2)
return T(begin, end);
auto firstHalf = mergeSort<T>(begin, begin + n/2);
auto secondHalf = mergeSort<T>(begin + n/2, end);
T merged;
std::merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
The merge
and the parameters begin
and end
are good examples why using namespace std
has to be handled with care. All those names are already in std
. Use using namespace std
only if you know the risks.
That being said, your generic variant will only work with random access iterators. If you want to make it more generic, you have to use std::distance
and std::next
instead of end - begin
and begin + n/2
:
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
const auto n = std::distance(begin, end);
if (n < 2)
return T(begin, end);
const auto midpoint = std::next(begin, n / 2);
const auto firstHalf = mergeSort<T>(begin, midpoint);
const auto secondHalf = mergeSort<T>(midpoint, end);
T merged;
std::merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
Even if you don't want to use std::next
re-use the midpoint
.
Thx! I never use "using namespace std;" for real code, this was just an exercise for myself, that's also why I wanted to implement merge for myself.
â Nils
Jan 5 at 22:37
1
@Nils I was almost asleep when I wrote that review. I forgot to say that you can provide your ownmerge
version with a similar interface and thereby splitmergeSort
into two functions: one that splits, callsmerge
and calls itself recursively, and one that just merges.
â Zeta
Jan 6 at 11:10
I agree with the split, makes it easier to read.
â Nils
Jan 7 at 20:25
add a comment |Â
up vote
3
down vote
up vote
3
down vote
While your merge step in mergeSort
works it should be put into its own function:
// Exercise: make `my_merge` even more generic
template <typename Iterator, typename OutputIterator>
void my_merge(Iterator first1, Iterator last1,
Iterator first2, Iterator last2,
OutputIterator out)
while (first1 != last1 && first2 != last2)
if (*first1 < *first2)
*out++ = *first1++;
else
*out++ = *first2++;
while(first1 != last1)
*out++ = *first1++;
while(first2 != last2)
*out++ = *first2++;
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
auto n = end - begin;
if (n < 2)
return T(begin, end);
auto firstHalf = mergeSort<T>(begin, begin + n/2);
auto secondHalf = mergeSort<T>(begin + n/2, end);
T merged;
my_merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
Note that I had to call my_merge
that way since std
already contains std::merge
, which we can use instead:
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
auto n = end - begin;
if (n < 2)
return T(begin, end);
auto firstHalf = mergeSort<T>(begin, begin + n/2);
auto secondHalf = mergeSort<T>(begin + n/2, end);
T merged;
std::merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
The merge
and the parameters begin
and end
are good examples why using namespace std
has to be handled with care. All those names are already in std
. Use using namespace std
only if you know the risks.
That being said, your generic variant will only work with random access iterators. If you want to make it more generic, you have to use std::distance
and std::next
instead of end - begin
and begin + n/2
:
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
const auto n = std::distance(begin, end);
if (n < 2)
return T(begin, end);
const auto midpoint = std::next(begin, n / 2);
const auto firstHalf = mergeSort<T>(begin, midpoint);
const auto secondHalf = mergeSort<T>(midpoint, end);
T merged;
std::merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
Even if you don't want to use std::next
re-use the midpoint
.
While your merge step in mergeSort
works it should be put into its own function:
// Exercise: make `my_merge` even more generic
template <typename Iterator, typename OutputIterator>
void my_merge(Iterator first1, Iterator last1,
Iterator first2, Iterator last2,
OutputIterator out)
while (first1 != last1 && first2 != last2)
if (*first1 < *first2)
*out++ = *first1++;
else
*out++ = *first2++;
while(first1 != last1)
*out++ = *first1++;
while(first2 != last2)
*out++ = *first2++;
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
auto n = end - begin;
if (n < 2)
return T(begin, end);
auto firstHalf = mergeSort<T>(begin, begin + n/2);
auto secondHalf = mergeSort<T>(begin + n/2, end);
T merged;
my_merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
Note that I had to call my_merge
that way since std
already contains std::merge
, which we can use instead:
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
auto n = end - begin;
if (n < 2)
return T(begin, end);
auto firstHalf = mergeSort<T>(begin, begin + n/2);
auto secondHalf = mergeSort<T>(begin + n/2, end);
T merged;
std::merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
The merge
and the parameters begin
and end
are good examples why using namespace std
has to be handled with care. All those names are already in std
. Use using namespace std
only if you know the risks.
That being said, your generic variant will only work with random access iterators. If you want to make it more generic, you have to use std::distance
and std::next
instead of end - begin
and begin + n/2
:
template<typename T>
T mergeSort(typename T::iterator begin, typename T::iterator end)
const auto n = std::distance(begin, end);
if (n < 2)
return T(begin, end);
const auto midpoint = std::next(begin, n / 2);
const auto firstHalf = mergeSort<T>(begin, midpoint);
const auto secondHalf = mergeSort<T>(midpoint, end);
T merged;
std::merge(firstHalf.begin(), firstHalf.end(),
secondHalf.begin(), secondHalf.end(),
std::back_inserter(merged));
return merged;
Even if you don't want to use std::next
re-use the midpoint
.
edited Jan 6 at 11:18
answered Jan 5 at 22:24
Zeta
14.3k23267
14.3k23267
Thx! I never use "using namespace std;" for real code, this was just an exercise for myself, that's also why I wanted to implement merge for myself.
â Nils
Jan 5 at 22:37
1
@Nils I was almost asleep when I wrote that review. I forgot to say that you can provide your ownmerge
version with a similar interface and thereby splitmergeSort
into two functions: one that splits, callsmerge
and calls itself recursively, and one that just merges.
â Zeta
Jan 6 at 11:10
I agree with the split, makes it easier to read.
â Nils
Jan 7 at 20:25
add a comment |Â
Thx! I never use "using namespace std;" for real code, this was just an exercise for myself, that's also why I wanted to implement merge for myself.
â Nils
Jan 5 at 22:37
1
@Nils I was almost asleep when I wrote that review. I forgot to say that you can provide your ownmerge
version with a similar interface and thereby splitmergeSort
into two functions: one that splits, callsmerge
and calls itself recursively, and one that just merges.
â Zeta
Jan 6 at 11:10
I agree with the split, makes it easier to read.
â Nils
Jan 7 at 20:25
Thx! I never use "using namespace std;" for real code, this was just an exercise for myself, that's also why I wanted to implement merge for myself.
â Nils
Jan 5 at 22:37
Thx! I never use "using namespace std;" for real code, this was just an exercise for myself, that's also why I wanted to implement merge for myself.
â Nils
Jan 5 at 22:37
1
1
@Nils I was almost asleep when I wrote that review. I forgot to say that you can provide your own
merge
version with a similar interface and thereby split mergeSort
into two functions: one that splits, calls merge
and calls itself recursively, and one that just merges.â Zeta
Jan 6 at 11:10
@Nils I was almost asleep when I wrote that review. I forgot to say that you can provide your own
merge
version with a similar interface and thereby split mergeSort
into two functions: one that splits, calls merge
and calls itself recursively, and one that just merges.â Zeta
Jan 6 at 11:10
I agree with the split, makes it easier to read.
â Nils
Jan 7 at 20:25
I agree with the split, makes it easier to read.
â Nils
Jan 7 at 20:25
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f184402%2fgeneric-merge-sort-in-c%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password