Function which removes a given word from a string

Any improvements? Please review.

It could be faster if I used my own strstr function that searches from the end (less memory to move), but I wanted to use only the standard functions.

    char *removestr(char *str, const char *word)
    {
        char *ptr = str;
        size_t len = strlen(word);
        while((ptr = strstr(ptr, word)))
        {
            if(isalnum(*(ptr + len)) || (ptr != str && isalnum(*(ptr - 1))))
                ptr += len;
            else
                memmove(ptr, ptr + len, strlen(ptr + len) + 1);
        }
        return str;
    }
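A usage sketch may help readers here. The loop body below is an assumption pieced together from the fragments quoted in the answers (the isalnum checks and the memmove call); the (unsigned char) casts, the empty-word guard, and the demo_removestr helper are additions for illustration and are not part of the reviewed code.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Assumed shape of the function under review: removes whole-word
 * occurrences of `word` from `str` in place and returns `str`. */
char *removestr(char *str, const char *word)
{
    assert(str != NULL && word != NULL);
    char *ptr = str;
    size_t len = strlen(word);

    if (len == 0)   /* guard added for this sketch: an empty word would never advance */
        return str;

    while ((ptr = strstr(ptr, word)) != NULL) {
        if (isalnum((unsigned char)ptr[len]) ||
            (ptr != str && isalnum((unsigned char)ptr[-1])))
            ptr += len;   /* part of a longer word: keep it */
        else
            memmove(ptr, ptr + len, strlen(ptr + len) + 1);   /* close the gap */
    }
    return str;
}

/* Hypothetical helper: string literals are not writable, so the text is
 * copied into a caller-supplied buffer before the in-place removal. */
char *demo_removestr(char *buffer, size_t size, const char *text, const char *word)
{
    assert(strlen(text) < size);
    strcpy(buffer, text);
    return removestr(buffer, word);
}
```

With a 64-byte buffer, demo_removestr(buf, sizeof buf, "in the beginning", "the") leaves "in  beginning" in buf (both surrounding spaces are kept), while "thenot notthe" comes back unchanged because neither match is a whole word.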







  • Welcome to Code Review! Are you able to edit your post to include some example usages for this function? – Sam Onela, May 8 at 22:28 (score 3)
















asked May 8 at 21:56 by P__J__; edited May 8 at 22:23 by Sam Onela







2 Answers

















Answer 1 (score 7)













I would suggest renaming your variables. str, ptr, and len don't tell us much about what they're meant to be.

    size_t len = strlen(word);

Since this is never going to change (it is abstractly a property of a const input), you might as well mark it const.

    while((ptr = strstr(ptr, word)))

I would put a comment here, not least as a note that the assignment is deliberate (so often a single equals sign in an if or while implies a bug). Really, you could do with a few more comments throughout to say why you're doing things. For example, the if(isalnum... could be commented to explain that you only want to remove whole words.

    strlen(ptr + len)

This may work fine, but it's a moderately expensive thing to do because it has to run down the length of the remaining string. I would be inclined to measure the length of str once, outside the loop, and track it across updates.
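A minimal sketch of that suggestion, assuming the loop has the shape implied by the snippets quoted in this answer. The names remove_word_tracked, text_len and word_len are hypothetical, and the empty-word guard and (unsigned char) casts are additions for the sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Same loop as the reviewed code, but the string length is measured once
 * and updated after each removal instead of calling strlen() per match. */
char *remove_word_tracked(char *text, const char *word)
{
    assert(text != NULL && word != NULL);
    const size_t word_len = strlen(word);
    size_t text_len = strlen(text);   /* measured once, kept up to date below */
    char *match = text;

    if (word_len == 0)
        return text;

    while ((match = strstr(match, word)) != NULL) {
        if (isalnum((unsigned char)match[word_len]) ||
            (match != text && isalnum((unsigned char)match[-1]))) {
            match += word_len;        /* embedded in a longer word: keep it */
        } else {
            /* bytes after the match, including the terminating '\0' */
            size_t tail = text_len - (size_t)(match - text) - word_len + 1;
            memmove(match, match + word_len, tail);
            text_len -= word_len;
        }
    }
    return text;
}
```

This removes the repeated strlen() call but still moves the whole tail on every removal, so the worst case stays quadratic in the length of the string.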



    memmove(ptr, ptr + len, strlen(ptr + len) + 1);

Again, this works fine, but it has to copy (quite slowly and carefully) the whole remaining string each time. Because it's in a loop, this becomes an $ O(n^2) $ function. One solution would be to move back only the part of the string up to the next occurrence of word. This would mean a bit more complexity in tracking the size of the gap that you're building up, but it would reduce the overall complexity of the function to $ O(n) $.
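One way the gap-tracking idea might look, as a sketch only. It assumes word consists of letters and digits (so that the whole-word test can safely look at the original neighbouring bytes); the names remove_word_linear, read_pos, write_pos and scan are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Each kept segment is moved once, straight to its final position,
 * so no byte is copied more than once. */
char *remove_word_linear(char *text, const char *word)
{
    assert(text != NULL && word != NULL);
    const size_t word_len = strlen(word);
    char *write_pos = text;        /* where the next kept byte goes */
    const char *read_pos = text;   /* start of the text not yet copied */
    const char *scan = text;       /* where the next search starts */
    const char *match;

    if (word_len == 0)
        return text;

    while ((match = strstr(scan, word)) != NULL) {
        if (isalnum((unsigned char)match[word_len]) ||
            (match != text && isalnum((unsigned char)match[-1]))) {
            scan = match + word_len;   /* not a whole word: leave it in place */
            continue;
        }
        /* move only the segment between the previous removal and this match */
        size_t keep = (size_t)(match - read_pos);
        memmove(write_pos, read_pos, keep);
        write_pos += keep;
        read_pos = match + word_len;   /* the gap grows by word_len */
        scan = read_pos;
    }
    /* move the remainder, including the terminating '\0' */
    memmove(write_pos, read_pos, strlen(read_pos) + 1);
    return text;
}
```

The distance between read_pos and write_pos is the gap the answer mentions; it grows by strlen(word) per removal. The cost of the strstr calls themselves depends on the library implementation.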






  • I disagree with the need to comment the single = sign. That is what the extra set of parentheses inside the while() is for. It is quite common (and considered good practice by most, though not by me personally) to use assignment in both if and while in C. The extra parentheses show that this is an assignment that must be done before the expression is evaluated as a condition. – Martin York, May 9 at 0:03 (score 3)

  • I usually make that case more explicit (because double parens are easy to overlook) by writing if/while ((x = foo(whatever)) != NULL), so there's still an obvious comparison. – user1118321, May 9 at 3:38 (score 1)

  • As well as the variables, the function could do with a better name (e.g. remove_word()). I couldn't at first see why isalnum() was required. – Toby Speight, May 9 at 7:53 (score 1)
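To illustrate the explicit-comparison style from the comments, here is a small sketch. The function count_occurrences is hypothetical and only serves to show the loop header; it is not part of the reviewed code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts non-overlapping occurrences of `word` in `text`. */
size_t count_occurrences(const char *text, const char *word)
{
    assert(text != NULL && word != NULL);
    const size_t word_len = strlen(word);
    size_t count = 0;
    const char *match = text;

    if (word_len == 0)
        return 0;

    /* Deliberate assignment inside the condition; the explicit comparison
     * against NULL makes that intent visible to readers and compilers. */
    while ((match = strstr(match, word)) != NULL) {
        ++count;
        match += word_len;
    }
    return count;
}
```

Both spellings behave identically; the choice between the doubled parentheses and the explicit != NULL is purely a readability preference.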

















Answer 2 (score 2)













Order of complexity higher than needed



With memmove(), whose execution time varies linearly with strlen(str), inside a loop whose iteration count can depend on strlen(str), this algorithm is at least $ O(n^2) $, while $ O(n) $ is possible. Using separate pointers to read from str and write to str can accomplish $ O(n) $, still in the forward direction. See below.



What if arguments overlap?



word could point into str (for example, at its end), so removestr(char *str, const char *word) must allow for word changing any time str changes. To tell the compiler that this situation does not arise, use restrict. The caller must then guarantee that the two arguments do not overlap; passing overlapping arguments becomes undefined behavior.

    // char *removestr(char *str, const char *word)
    char *removestr(char *restrict str, const char *restrict word)

This may improve performance a bit, as it can allow various compiler optimizations.



Avoid UB



is...(x) functions are UB when x < 0 && x != EOF, as they are designed for unsigned char values and EOF. As a char may be negative, cast to (unsigned char) to cope with this pesky C nuance.

    // isalnum(*(ptr + len))
    isalnum((unsigned char) *(ptr + len))
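A small sketch of how the cast could be kept in one place. The helper names is_word_char and is_standalone are hypothetical; they are not part of the reviewed code.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical wrapper: safe to call with any char value, including
 * negative ones on platforms where plain char is signed. */
int is_word_char(char c)
{
    return isalnum((unsigned char)c) != 0;
}

/* Hypothetical helper: nonzero when the `len` bytes at text[pos] are not
 * preceded or followed by a letter or digit. */
int is_standalone(const char *text, size_t pos, size_t len)
{
    assert(text != NULL && pos + len <= strlen(text));
    if (pos > 0 && is_word_char(text[pos - 1]))
        return 0;
    return !is_word_char(text[pos + len]);
}
```

For example, is_standalone("in the end", 3, 3) is nonzero, while is_standalone("thenot", 0, 3) is 0 because the match is followed by a letter.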



Sample lightly tested $ O(n) $ code following OP's lead of while((ptr = strstr(ptr, word)))

(Really $ O(strlen(str) * strlen(word)) $ vs. OP's $ O(strlen(str)^2 * strlen(word)) $.)



    // Remove all "stand-alone" occurrences of `word` from `str`.
    // For `word` to "stand-alone", it must not be preceded nor followed by letter/digit
    char *removestr(char * restrict str, const char *restrict word) {
      size_t w_len = strlen(word);
      if (w_len == 0) {
        return str;
      }
      const char *src = str;
      char *dest = str;
      char *token;

      while ((token = strstr(src, word)) != NULL) {
        // copy the text that precedes the match
        while (src < token) {
          *dest++ = *src++;
        }
        if (isalnum((unsigned char) src[w_len]) ||
            (src > str && isalnum((unsigned char) src[-1]))) {
          // `word` match is not "stand-alone"
          *dest++ = *src++;
        } else {
          // skip `word`
          src += w_len;
        }
      }
      // copy rest of `str`
      while ((*dest++ = *src++) != '\0');
      return str;
    }



Tests



    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    void removestr_test(const char *str, const char *word) {
      char *test_str = strdup(str);  // strdup() is POSIX (standard C since C23)
      char *result = removestr(test_str, word);
      printf("%d <%s> <%s> --> <%s>\n", result == test_str, str, word, test_str);
      free(test_str);
    }

    int main(void) {
      removestr_test("", "the");
      removestr_test("the", "the");
      removestr_test("the beginning", "the");
      removestr_test("in the beginning", "the");
      removestr_test("end the", "the");
      removestr_test("thenot thenot notthe notthe", "the");
      removestr_test("xx the xx the xx the xx the xx the xx the", "the");
      return 0;
    }



Output



    1 <> <the> --> <>
    1 <the> <the> --> <>
    1 <the beginning> <the> --> < beginning>
    1 <in the beginning> <the> --> <in  beginning>
    1 <end the> <the> --> <end >
    1 <thenot thenot notthe notthe> <the> --> <thenot thenot notthe notthe>
    1 <xx the xx the xx the xx the xx the xx the> <the> --> <xx  xx  xx  xx  xx  xx >





First answer: answered May 8 at 23:00 by Josiah; edited May 8 at 23:16







Second answer: answered May 9 at 14:37 by chux; edited May 9 at 15:53






















             
