Question

I have been playing around with these two functions for a bit, dunno if this is practical though. I have only been learning C++ for a couple of days, coming from a Java background.

There we had a function that does this for us; I tried making something similar.

#include <iostream>
#include <string>
#include <vector>

using namespace std;

vector<string> parse(string test, string Deli);

int main()
{
    vector<string> x = parse("random text to test splitting apart ", " ");
    // note , the deliminator have to be after the text not before it.
    for (string &e : x)
    {
        cout << e << endl;
    }
    return 0;
}

vector<string> parse(string test, string Deli)
{
    int count = 0; int token = 0;
    vector<string> parsed;
    for (size_t i = 0; i <= count; i++)
    {
        string x = (test.substr(token, test.find(Deli, token)-token));
        parsed.push_back(x);
        token += test.find(Deli, token +1) - (token-1);
        test.find(Deli, token) != std::string::npos ? count++ : count;
    }
    return parsed;
}

There is a test case, yes. And besides a warning, it's seemingly working. — Apr 29 at 13:04
@bruglesco, I tested it a couple of times and it did work, but I'm not sure what scenario could make it fail, if there is one. — Apr 29 at 13:07
@NoorNizar, I recommend waiting before accepting an answer. People sometimes feel discouraged from posting their own if one is already accepted. The optimal time to accept is a day or two after posting. — Apr 29 at 15:49

Toby Speight 17.4k13489 · Accepted Answer · 2018-04-30 16:35:59Z

Bug

If the original string doesn't have a trailing delimiter, the code leaves out the last token. For example, parse("a b", " ") returns only "a".

Unqualified calls

An unqualified call is any function call whose name is not preceded by a scope qualifier (::). Many people oppose using namespace std;. In my opinion, it is not much of a problem if done carefully (in small programs, or at function/block scope). What is dangerous, though, is the unqualified call itself: name lookup may silently pick a different function than the one you meant. My recommendation is to just type std:: unless you deliberately want the standard library function to act as a fallback in the current scope. I know this sounds confusing, but it should become clearer with time.
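
To make the "fallback" remark a bit more concrete, here is a minimal sketch. The namespace shapes, the type Box, and both exchange_* helpers are made up for illustration and are not part of the code under review. It contrasts a qualified call with the deliberate using std::swap; pattern.

```cpp
#include <string>
#include <utility>

namespace shapes {
    // Hypothetical type, used only for this illustration.
    struct Box {
        std::string label;
        int swaps_seen = 0;
    };

    // A custom swap that unqualified calls can find through
    // argument-dependent lookup (ADL).
    void swap(Box& a, Box& b) noexcept {
        std::swap(a.label, b.label);  // qualified: always the standard one
        ++a.swaps_seen;
        ++b.swaps_seen;
    }
}

// Qualified call: always std::swap, no matter what else is in scope.
template <typename T>
void exchange_qualified(T& a, T& b) {
    std::swap(a, b);
}

// Unqualified call with an intentional fallback to the standard library.
template <typename T>
void exchange_with_fallback(T& a, T& b) {
    using std::swap;  // fallback when ADL finds nothing more specific
    swap(a, b);       // shapes::swap for Box, std::swap for int
}
```

With a Box, exchange_with_fallback ends up in shapes::swap, while exchange_qualified bypasses it. With an int, both end up in std::swap. The unqualified form is only a good idea when that difference is exactly what you want.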

Formatting

There are formatting conventions that many people use to keep code readable for one another. Sometimes they help in understanding control flow, templates, pointers, etc. I don't have a strong opinion on any particular one, as long as it is applied consistently. The formatting used in the code in question is rather unusual. I use snake_case for everything except template parameter names and concepts. The more common convention, though, is CamelCase for type names and camelCase for variables and functions. ALL_CAPS is reserved for macros, as they are quite evil.

Accepting by value

Although taking parameters by value sometimes makes sense, it doesn't in the case shown in the question. When something is used for read-only purposes, pass it by const T&. In this case, it might even make sense to take std::string_view. At the moment the code copies the arguments, which may be very costly if the strings are big.
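
As a small illustration of the two read-only parameter styles mentioned above, consider the following sketch. The function names count_char_ref and count_char are hypothetical, and std::string_view requires C++17.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>

// Read-only access through a reference: no copy of the caller's std::string,
// but a string literal argument still creates a temporary std::string.
std::size_t count_char_ref(const std::string& text, char wanted)
{
    return static_cast<std::size_t>(
        std::count(text.begin(), text.end(), wanted));
}

// Read-only access through a view: accepts std::string and string literals
// alike without allocating.
std::size_t count_char(std::string_view text, char wanted)
{
    return static_cast<std::size_t>(
        std::count(text.begin(), text.end(), wanted));
}
```

Both can be called as count_char("a b c", ' ') or with an existing std::string. Only the first has to build a std::string when given a literal.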

Peculiar way of looping

At the moment, count and i together act as a sort of flag, either true or false (they take many values, but squashing i and count into one variable would produce the flag). It would be better to loop until find returns npos.
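
A minimal sketch of such a loop, still using only find and substr, could look like this. The name split_with_find is made up, and the handling of an empty delimiter is my own assumption about what is sensible, not something from the question.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on every occurrence of `delimiter`.
std::vector<std::string> split_with_find(const std::string& text,
                                         const std::string& delimiter)
{
    std::vector<std::string> tokens;
    if (delimiter.empty()) {
        // Assumption: an empty delimiter yields the input unchanged,
        // since find("") would otherwise match forever.
        tokens.push_back(text);
        return tokens;
    }

    std::string::size_type start = 0;
    std::string::size_type found = text.find(delimiter, start);
    while (found != std::string::npos) {
        tokens.push_back(text.substr(start, found - start));
        start = found + delimiter.size();   // skip past the delimiter
        found = text.find(delimiter, start);
    }

    // Keep the tail when the input does not end with a delimiter.
    if (start != text.size()) {
        tokens.push_back(text.substr(start));
    }
    return tokens;
}
```

Each find is performed once per token, the loop condition states directly when to stop, and the last token is kept whether or not a trailing delimiter is present.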

size_t

For indexing, size_t is usually used. Some people prefer ptrdiff_t. The advantage they have over int is that they are as big as indexable memory (though the maximum of ptrdiff_t might be less than the maximum of size_t). int might not be big enough in some cases.

Small stuff

Use const auto& when looping in read-only mode (for (const auto& e: x)).

return 0; is redundant at the end of main.

Don't use std::endl unless immediate printing is necessary. An example of immediate printing is a real-time game, where users need to see the output right away.

Don't put extra () anywhere unless the compiler doesn't understand you. Most of the time they are redundant; in the worst cases they produce bugs and dangling references.

My approach

I'm a lover of standard library algorithms. Here is a good talk to get started with, if you know the basic algorithms (disclaimer: I'm not familiar with the speaker, though ACCU is a good conference).

Let's develop the algorithm first.

1. prev_pos <- start of the string
2. next_pos <- location of the next delimiter at or after prev_pos
3. If next_pos is the end of the string, go to 7.
4. Append the range [prev_pos, next_pos) to the results.
5. Set prev_pos to next_pos + delimiter.size().
6. Go to 2.
7. If prev_pos is not the end of the string, add the remaining portion of the string to the results.

(This is a very bad parody of Knuth's style.)

From the algorithm, it is already clear that the variables prev_pos and next_pos are not well named. I left them that way as an example of where variables could be given better names. A good algorithm is complemented by good names, and sometimes hampered by bad ones.

Now the implementation. When starting out with a language, it is very hard to find a direct equivalent for each action in the algorithm. Fortunately, C++ allows coming close to it, since the building blocks are already in the standard library. I recommend using good, up-to-date documentation. I use cppreference.com.

The first step is understanding iterators. I don't really have a good reference for them, but iterators are the glue that binds containers and algorithms in C++. For now, think of an iterator as a pointer into the string. Every standard library container has begin() and end(). The former returns an iterator to the beginning of the container, and the latter a one-past-the-end iterator. In the case of "word", begin() would point to w, and end() to whatever comes after d.

The algorithm that searches for the occurrence of a sequence in a bigger sequence is std::search(). If "very long string" is std::searched for "string", it will return an iterator pointing to the s of "string".
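
The std::search behaviour described here can be tried out with a tiny helper. The name offset_of is made up for this sketch. It converts the returned iterator into an offset so the result is easy to inspect.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Returns the offset of the first occurrence of `needle` in `haystack`,
// or haystack.size() if there is none (std::search then returns end()).
std::size_t offset_of(std::string_view haystack, std::string_view needle)
{
    auto it = std::search(haystack.begin(), haystack.end(),
                          needle.begin(), needle.end());
    return static_cast<std::size_t>(it - haystack.begin());
}
```

For "very long string" and "string" this gives 10, the position of the s. For a needle that does not occur, the result equals the length of the haystack, which corresponds to the end() iterator.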

std::vector has emplace_back(), which constructs the element in place. It takes whatever the element type's constructor takes (in this case std::string's). In the code in the question, a string is created from the substring, and only then copied into the result. Using an iterator pair eliminates the unnecessary copy.

Code

#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <string_view>
#include <vector>

// Precondition: delimiter is not empty.
std::vector<std::string> split_string(std::string_view content, std::string_view delimiter)
{
    std::vector<std::string> result;
    auto prev_pos = content.begin();
    auto next_pos = std::search(prev_pos, content.end(),
                                delimiter.begin(), delimiter.end());
    while (next_pos != content.end())
    {
        // Construct the token from [prev_pos, next_pos) directly in the vector.
        result.emplace_back(prev_pos, next_pos);
        prev_pos = next_pos + delimiter.size();
        next_pos = std::search(prev_pos, content.end(),
                               delimiter.begin(), delimiter.end());
    }

    // Keep the tail when the input does not end with a delimiter.
    if (prev_pos != content.end())
    {
        result.emplace_back(prev_pos, content.end());
    }
    return result;
}

int main()
{
    auto tokens = split_string("just a bunch of words", " ");

    std::copy(tokens.begin(), tokens.end(),
              std::ostream_iterator<std::string>(std::cout, ", "));
}

Demo.

Printing is done by copying the results into std::cout. Not everybody likes this, so a plain loop

for (const auto& token : tokens)
    std::cout << token << ", ";

will do too.

Some readers might ask, "Why not return std::vector<std::string_view>?" The answer is that the views might dangle if the input is converted from a temporary string. A good example would be

split_string(std::string{"string to split"}, "a");

The views would point into the temporary's storage, which is destroyed at the end of the full expression containing the call, so the resulting views would dangle. One way to deal with this would be to create a function that copies, name it split_string_copy, and let the current one return views (hoping that the IDE will list both versions next to each other, so that the developer sees the right one).
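
A view-returning variant might look roughly like the sketch below. The name split_string_views and the treatment of an empty delimiter are assumptions made for this illustration. The caller is responsible for keeping the underlying characters alive for as long as the views are used.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical view-returning splitter: no characters are copied, each
// element refers into the storage that `content` refers to.
std::vector<std::string_view> split_string_views(std::string_view content,
                                                 std::string_view delimiter)
{
    std::vector<std::string_view> result;
    if (delimiter.empty()) {
        // Assumption: an empty delimiter yields the input unchanged.
        if (!content.empty()) result.push_back(content);
        return result;
    }

    std::size_t start = 0;
    std::size_t found = content.find(delimiter, start);
    while (found != std::string_view::npos) {
        result.push_back(content.substr(start, found - start));
        start = found + delimiter.size();
        found = content.find(delimiter, start);
    }
    if (start != content.size()) {
        result.push_back(content.substr(start));
    }
    return result;
}
```

Calling it with a string literal or a named std::string that outlives the result is fine. Calling it with a temporary std::string and storing the result is exactly the dangling case described above.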

Yet another approach

Evangelists of the iostream library might come up with a solution that modifies the stream's locale, namely its std::ctype<char> facet. Although this is really good at parsing simple CSV files, it might be inefficient in a place where the function gets called many times. In stateful cases it might prove to be the better alternative.
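
For the curious, a rough sketch of that locale-based idea follows. The class name comma_is_space and the function split_on_commas are made up, and the delimiter is hard-coded to a comma to keep the example short. It follows the well-known pattern of deriving from std::ctype<char> with a modified classification table.

```cpp
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical facet: classifies ',' as whitespace and ' ' as an ordinary
// character, so operator>> splits on commas instead of spaces.
class comma_is_space : public std::ctype<char> {
public:
    explicit comma_is_space(std::size_t refs = 0)
        : std::ctype<char>(make_table(), false, refs) {}

private:
    static const mask* make_table()
    {
        // Start from a copy of the classic "C" table, then adjust two entries.
        static std::vector<mask> table(classic_table(),
                                       classic_table() + table_size);
        table[','] |= space;
        table[' '] &= ~space;
        return table.data();
    }
};

std::vector<std::string> split_on_commas(const std::string& line)
{
    std::istringstream in(line);
    // The locale takes ownership of the facet (refs == 0).
    in.imbue(std::locale(in.getloc(), new comma_is_space));

    std::vector<std::string> fields;
    std::string field;
    while (in >> field) {
        fields.push_back(field);
    }
    return fields;
}
```

A new stream and locale are built on every call here, which is the inefficiency mentioned above. Keeping one imbued stream around and feeding it many lines would amortize that cost.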

I couldn't use English as well as C++ in this post. If you see any misspelling, or a place where the wording could be improved, please comment or edit directly. I believe the code works, but any fixes are welcome too. — Apr 29 at 15:44
Nice answer. Could you maybe add a small explanation of what exactly you find so unusual about the formatting in OP's code? I know it's just rough example code, but I think you can probably choose a better variable name than s ;) — Apr 29 at 16:01
@yuri, synchronizing the code in the post and the one in the link is a very tiresome process. I just thought that it might be confusing if the two versions didn't match. EDIT: I expanded the paragraph about formatting. I'll try to replace s with something better, like content. EDIT2: done. — Apr 29 at 16:03
Thank you for your answer. I actually learned a lot of things from it. But one thing: why use string_view? Excuse my lack of understanding, but isn't it doing the same as passing a string reference with the & operator? — Apr 29 at 17:14
@NoorNizar, std::string needs to copy into its own allocated memory (potentially calling new), whereas string_view is very close to a pair of const char* pointing to the beginning and end of the string. The standard refers to this as "materialization", when a temporary is passed to a function taking const &. It also has another, more important feature, related to template metaprogramming, a topic that is quite hard to explain. — Apr 29 at 17:22

score 6 · Answer 2 · 2018-04-29 19:17:40Z

In real life, I use Boost's split, located in the string algorithms library.

Note in general that you should be familiar with std and with a number of Boost libraries as ever-present common code.

Coming to C++ from most other languages, you should know that you shouldn't be playing around with substr. The string class is a bit of an odd duck because it was being developed "conventionally" by standardizing existing practice and experience with other languages, and then all of a sudden the STL came along.

I was involved in implementing the string class from an early draft of the standards process around 1994, and it was even more conventional, using index positions and substrings for everything.

Once the STL was made the foundation of the Standard Library, the string class was thrown out and a simple one made that's similar to a vector but with handy support for string literals. That didn't fly. The compromise was what we have today, which is a fully proper STL container that also has some support for traditional string operations, thus allowing people to adopt it easily by swapping out their home-made string class with minimal fuss, as opposed to having to completely rewrite the code to use STL algorithms.

for (size_t i = 0; i <= count; i++)
{
    string x = (test.substr(token, test.find(Deli, token)-token));
    parsed.push_back(x);
    token += test.find(Deli, token +1) - (token-1);
    test.find(Deli, token) != std::string::npos ? count++ : count;
}

Keeping your general idea of finding the delimiter and then extracting the range before it but after the previous one found, use iterators rather than string index positions.

So the starting place, rather than index 0, is the begin (or cbegin) iterator. Your for loop is structured oddly; it is not really a for-style loop at all, so using the for keyword is confusing; and you have to do the find twice.

Keeping the same general idea, just express it more cleanly, in a way you will find common in C++ STL code:

I'll start with the easier case where the delimiter is a single character. So we have parameters (const string& test, const char delim)

Set things up:

using std::cbegin;
using std::cend; // "two-step"; required for more generic code

auto start = cbegin(test);
auto End = cend(test); // so I don't have to keep calling it

The idiom of using the unqualified non-member forms of begin etc. is preferred (ref 1, ref 2), and will make your code work with templates (and make all code look the same, rather than writing it one way outside templates and another way inside them). Code often evolves by taking what you did once as a plain function and generalizing it somehow; or the type of something in a large project is changed and you have to hunt down all uses and fix things, so the same techniques used when you don't really know the exact type in the first place will help you here too!

Anyway, here is your starting point: an iterator to the beginning of the whole string as the start of the first token. Now you loop: find the delimiter beyond this position, make that the end of the token, extract what's between, then update the start to resume where the end was on this iteration.

while ( ??? how do I know when I'm done ??? ) {
    auto token_end = std::find (start, End, delim);

The find algorithm will stop with the iterator pointing at the found character, or at End. Either way is good for us! No special testing needed. Note that ranges are delimited as half-open: including the begin, excluding the end. That is, an end iterator points one past the last char to keep. That means everything works naturally without any adjustments or fiddling:

    string token { start, token_end };

Note that a constructor takes a pair of iterators; exactly what we have! You can see now why substr is not needed; you can just create a string with a range directly, not needing a special function call.

Now you want to collect the results. Keeping things (fairly) simple:

 parsed.push_back(std::move(token));

Wrapping the argument in move makes it more efficient. The whole thing about move semantics is another subject to learn.

Now to advance:

 start = token_end + 1;

The next token starts just after the delim; we don't want to look at that character again.

But here we see a boundary condition. If token_end was the end of the string, this is an error. In fact, it signals the end of the loop! No more work needs to be done. So now I can go back and change the while to an indefinite loop and write the test here.

for (;;) {
    ⋮
    if (token_end == End) break;
    start = token_end + 1;
}

and that should be it.
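
Putting the fragments above together gives something like the following sketch. The function name parse_char and the exact assembly are my own; the individual statements are the ones discussed above.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Assembled single-character version; the name parse_char is hypothetical.
std::vector<std::string> parse_char(const std::string& test, char delim)
{
    using std::cbegin;
    using std::cend;

    std::vector<std::string> parsed;
    auto start = cbegin(test);
    const auto End = cend(test);
    for (;;) {
        auto token_end = std::find(start, End, delim);
        std::string token(start, token_end);   // range constructor, no substr
        parsed.push_back(std::move(token));
        if (token_end == End) break;            // no delimiter left: done
        start = token_end + 1;                  // resume just past the delimiter
    }
    return parsed;
}
```

Note one consequence of this structure: every delimiter separates two tokens, so "a," yields "a" followed by an empty string, and an empty input yields a single empty token. Whether that is desirable depends on the use case.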

If the delimiter is a string of characters, not a single char, it is slightly more complex.

The suitable algorithm is search, and that's a trivial change. Checking the end and updating the iteration is a tad more complex, though.

vector<string> parse (const string& test, const string& sep)
{
    vector<string> retval;
    using std::cbegin;
    using std::cend; // "two-step"; required for more generic code
    using std::size; // C++17; gives the length of the separator

    auto start = cbegin(test);
    auto End = cend(test); // so I don't have to keep calling it
    for (;;) {
        auto token_end = std::search (start, End, cbegin(sep), cend(sep));
        retval.emplace_back(start,token_end);
        if (token_end == End) break;
        start = token_end + size(sep);
    }
    return retval;
}

Hmm, so it wasn't any harder after all; just replace the +1 with the proper length of the separator. That's a good sign that the algorithm was structured well to match the way iterators and the standard algorithms work.

Now… you notice that the only thing you do with the parameters is get iterators into them. You do not rely on any members of std::string at all. So, it would be perfect to use the rather new (C++17) string_view here instead.

vector<string> parse (const string_view test, const string_view sep)

None of the code has to be changed, but now when you call it with a string literal, like parse("this is a test", " "), it does not have to construct a temporary std::string object and copy the literal string into it. That's the point of string_view, since this is a common thing.

Good luck, and keep trying to get into C++ !

Very clear code explanation. Thank you! — Apr 29 at 19:13

score 2 · Answer 3 · 2018-04-30 11:12:21Z

@Incomputable provides a comprehensive answer. I would add a small alternative approach using std::stringstream, as shown below. To make it accept a delimiter other than whitespace, the given delimiter needs to be converted to a character that the std::stringstream object treats as a separator.

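#include <algorithm>
#include <iostream>
#include <sstream>
#include <string>

// Note: the body of this function is cut off in the source. Replacing the
// delimiter with a space is an assumed reconstruction from the description.
std::string convert(const std::string& str, char delim)
{
    if (delim == ' ')
        return str;
    std::string result = str;
    std::replace(result.begin(), result.end(), delim, ' ');
    return result;
}

int main()
{
    std::stringstream sso(convert("random,text,to,test,splitting,apart", ','));
    std::string temp;
    while (sso >> temp)
        std::cout << temp << '\n';
}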

It seems like split_string aims to be a general splitter, i.e. not only for whitespace. I guess std::ctype<char> needs to be modified for that. I added the approach to the end of my post, with slight hints at a possible implementation. +1 though, your comment led me to the alternative approach. — Apr 29 at 15:56
@Incomputable, that's true, this only works for whitespace and escape characters; there is a workaround to make it suitable for a std::stringstream object. — Apr 30 at 11:06


Splitting std::string based on delimiter using only find and substr
