Title capitalization of strings in Java

I needed to port some ColdFusion code to Java that does title capitalization on strings, and came up with this solution. I would love it if anyone could give me some pointers on how to make it more efficient.



import org.apache.commons.text.WordUtils;

import java.util.HashSet;
import java.util.Set;

public class CapFirstTitle {

    public static String CapFirstTitle() {
        String inputText = " from a String to test Title Capitalization CASE hyphenated-word word from IBM od id XX if ";

        // Words that stay lowercase unless they are the first or last word.
        Set<String> whiteList = new HashSet<String>();

        whiteList.add("a");
        whiteList.add("above");
        whiteList.add("after");
        whiteList.add("ain't");
        whiteList.add("among");
        whiteList.add("an");
        whiteList.add("and");
        whiteList.add("as");
        whiteList.add("at");
        whiteList.add("below");
        whiteList.add("but");
        whiteList.add("by");
        whiteList.add("can't");
        whiteList.add("don't");
        whiteList.add("for");
        whiteList.add("from");
        whiteList.add("if");
        whiteList.add("in");
        whiteList.add("into");
        whiteList.add("it's");
        whiteList.add("nor");
        whiteList.add("of");
        whiteList.add("off");
        whiteList.add("on");
        whiteList.add("onto");
        whiteList.add("or");
        whiteList.add("over");
        whiteList.add("since");
        whiteList.add("the");
        whiteList.add("to");
        whiteList.add("under");
        whiteList.add("until");
        whiteList.add("up");
        whiteList.add("with");
        whiteList.add("won't");

        // Roman numerals and abbreviations that are fully uppercased.
        Set<String> alwaysCapitalize = new HashSet<String>();
        alwaysCapitalize.add("II");
        alwaysCapitalize.add("III");
        alwaysCapitalize.add("IV");
        alwaysCapitalize.add("V");
        alwaysCapitalize.add("VI");
        alwaysCapitalize.add("VII");
        alwaysCapitalize.add("VIII");
        alwaysCapitalize.add("IX");
        alwaysCapitalize.add("X");
        alwaysCapitalize.add("XI");
        alwaysCapitalize.add("XII");
        alwaysCapitalize.add("XIII");
        alwaysCapitalize.add("XIV");
        alwaysCapitalize.add("XV");
        alwaysCapitalize.add("XVI");
        alwaysCapitalize.add("XVII");
        alwaysCapitalize.add("XVIII");
        alwaysCapitalize.add("XIX");
        alwaysCapitalize.add("XX");
        alwaysCapitalize.add("XXI");
        alwaysCapitalize.add("OD");
        alwaysCapitalize.add("ID");
        alwaysCapitalize.add("PH");
        alwaysCapitalize.add("XH");
        alwaysCapitalize.add("UV");
        alwaysCapitalize.add("DOM");
        alwaysCapitalize.add("GS");

        alwaysCapitalize.add("ii");
        alwaysCapitalize.add("iii");
        alwaysCapitalize.add("iv");
        alwaysCapitalize.add("v");
        alwaysCapitalize.add("vi");
        alwaysCapitalize.add("vii");
        alwaysCapitalize.add("viii");
        alwaysCapitalize.add("ix");
        alwaysCapitalize.add("x");
        alwaysCapitalize.add("xi");
        alwaysCapitalize.add("xii");
        alwaysCapitalize.add("xiii");
        alwaysCapitalize.add("xiv");
        alwaysCapitalize.add("xv");
        alwaysCapitalize.add("xvi");
        alwaysCapitalize.add("xvii");
        alwaysCapitalize.add("xviii");
        alwaysCapitalize.add("xix");
        alwaysCapitalize.add("xx");
        alwaysCapitalize.add("xxi");
        alwaysCapitalize.add("od");
        alwaysCapitalize.add("id");
        alwaysCapitalize.add("ph");
        alwaysCapitalize.add("xh");
        alwaysCapitalize.add("uv");
        alwaysCapitalize.add("dom");
        alwaysCapitalize.add("gs");

        StringBuilder capitalizedString = new StringBuilder();

        if (inputText.contains(" ")) {

            inputText = inputText.toLowerCase().trim();

            String[] parts = inputText.split(" ");

            for (int i = 0; i < parts.length; i++) {

                // The first and last word are always capitalized.
                if (i == 0 | i == parts.length - 1) {

                    capitalizedString.append(WordUtils.capitalize(parts[i], ' ', '-')).append(" ");

                } else {

                    if (!whiteList.contains(parts[i])) {

                        if (!alwaysCapitalize.contains(parts[i])) {

                            capitalizedString.append(WordUtils.capitalize(parts[i], ' ', '-')).append(" ");

                        } else {

                            capitalizedString.append(parts[i].toUpperCase()).append(" ");

                        }

                    } else {

                        capitalizedString.append(parts[i]).append(" ");

                    }

                }

            }

        }

        return capitalizedString.toString();

    }

}








  • What precisely should the code do? Because I wonder for example why you convert the input text completely to lower case as a first step.
    – Koekje
    May 1 at 22:52

  • It is supposed to clean up strings that may be in the form of "This IS a SAmPle STRING" first. I didn't check if Apache Commons WordUtils already handles that particular case. Then it checks if we are on the first word or the last word of the string and if so don't check against the whitelist of words that should stay lowercase. Otherwise check against the whitelist and capitalize all words not on the whitelist, then go on to check against words that need to stay capitalized fully.
    – ozfive
    May 1 at 23:04

  • Looks like it doesn't
    – ozfive
    May 1 at 23:09

edited May 2 at 0:31 – Jamal♦
asked May 1 at 22:42 – ozfive

1 Answer (accepted)

  • CapFirstTitle() is a method name that disregards Java coding conventions for method names. Use standard lowerCamelCase.



  • whitelist and alwaysCapitalize should be private static final members. You can initialize these in a static initializer block like this:



    private static final Set<String> whitelist = new HashSet<>();

    static {
        whitelist.addAll(Arrays.asList("a", "above", "after", ...));
    }



  • The binary or (|) is a little uncommon in conditions. You won't really get much benefit out of it, especially since the boolean or (||) is short-circuiting. Use || over | here.


  • There is a lot of vertical whitespace here. I personally find that hard to follow and I'd delete most of that space if I wrote this code myself.
    YMMV. Since you're extremely consistent about it, I don't see any need to change it.


  • You're calling toLowerCase without specifying the Locale. While this may work in most cases, you should be aware of edge-cases involving languages like Turkish. If you have any way to be locale-aware here, you should strive to be. That can save you a lot of headaches down the line :)



  • This code always appends a " " at the end of the String. You could avoid this by first overwriting the parts array with the capitalized parts and subsequently using Arrays.stream(parts).collect(Collectors.joining(" "));



    Since Java 8, String exposes the method join (thanks to Koekje for pointing that out in the comments), which you can use as String.join(" ", parts);
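
To illustrate the Locale point from the list above, here is a minimal sketch of how a whitelist lookup can go wrong when lowercasing depends on the locale. The class name LocaleLowerCaseDemo, the method staysLowercase, and the three-word list are made up for this example and are not part of the question's code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical demo class; the names and the shortened word list are assumptions.
class LocaleLowerCaseDemo {
    private static final Set<String> WHITELIST =
            new HashSet<>(Arrays.asList("if", "in", "into"));

    // Lowercases the word with the given locale, then checks the whitelist.
    static boolean staysLowercase(String word, Locale locale) {
        return WHITELIST.contains(word.toLowerCase(locale));
    }
}
```

With Locale.ROOT, staysLowercase("IF", Locale.ROOT) finds "if" in the set. With Locale.forLanguageTag("tr"), the capital I lowercases to the dotless ı (U+0131), so the same lookup misses. Passing Locale.ROOT explicitly keeps the result independent of the JVM's default locale.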



The last point also contains an option for a deeper rewrite. Consider the following:



String[] capitalizedParts = Arrays.stream(input.toLowerCase().trim().split(" "))
        .map(CapFirstTitle::capSingleWord)
        .toArray(String[]::new);
// The first and last word are always capitalized.
capitalizedParts[0] = WordUtils.capitalize(capitalizedParts[0], ' ', '-');
int lastIdx = capitalizedParts.length - 1;
capitalizedParts[lastIdx] = WordUtils.capitalize(capitalizedParts[lastIdx], ' ', '-');

return String.join(" ", capitalizedParts);


This is a solution that hides the details of whitelist and alwaysCapitalize inside a method and makes the enforced capitalization of the first and last word explicit.



It avoids the (correct, but mental-overhead-inducing) use of StringBuilder and separates the capitalization logic from the reassembly of the String. You can even go so far as to reduce multiple spaces in the same process by filtering out empty array elements in the first Stream.
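
The answer does not show capSingleWord, so the following is only one possible sketch of how the pieces could fit together, not the answer's actual implementation. Assumptions: the class name TitleCaseSketch and the method toTitleCase are invented, the word lists are shortened, and capitalize is a plain-Java stand-in for WordUtils.capitalize(word, ' ', '-') so that the example runs without Apache Commons Text.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; names and shortened word lists are assumptions.
class TitleCaseSketch {
    // Words that stay lowercase (abbreviated; the question has the full list).
    private static final Set<String> WHITELIST = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("a", "an", "and", "from", "if", "of", "the", "to")));
    // Words that are fully uppercased (abbreviated, lowercase keys only).
    private static final Set<String> ALWAYS_CAPITALIZE = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("ii", "iii", "iv", "xx", "od", "id")));

    // Stand-in for WordUtils.capitalize(word, ' ', '-'): uppercases the
    // first character and every character that follows a hyphen.
    static String capitalize(String word) {
        StringBuilder sb = new StringBuilder(word.length());
        boolean upperNext = true;
        for (char c : word.toCharArray()) {
            sb.append(upperNext ? Character.toUpperCase(c) : c);
            upperNext = (c == '-');
        }
        return sb.toString();
    }

    // Possible body for capSingleWord; expects an already lowercased word.
    static String capSingleWord(String word) {
        if (WHITELIST.contains(word)) {
            return word;
        }
        if (ALWAYS_CAPITALIZE.contains(word)) {
            return word.toUpperCase(Locale.ROOT);
        }
        return capitalize(word);
    }

    static String toTitleCase(String input) {
        String[] parts = Arrays.stream(input.toLowerCase(Locale.ROOT).trim().split(" "))
                .filter(w -> !w.isEmpty()) // repeated spaces yield empty elements
                .map(TitleCaseSketch::capSingleWord)
                .toArray(String[]::new);
        if (parts.length == 0) {
            return "";
        }
        // The first and last word are capitalized even if whitelisted.
        parts[0] = capitalize(parts[0]);
        int lastIdx = parts.length - 1;
        parts[lastIdx] = capitalize(parts[lastIdx]);
        return String.join(" ", parts);
    }
}
```

For example, TitleCaseSketch.toTitleCase("the  lord of   the rings") returns "The Lord of the Rings", with no trailing space and the repeated spaces collapsed. One behavioural difference from the question's loop: a first or last word from the always-capitalize list stays fully uppercase here, whereas the original loop would only capitalize its first letter.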






  • This is incredible. I usually have to work on an island and this is going to help me become a better programmer! I really appreciate all of the points made. I will take them to heart. Thank you!
    – ozfive
    May 2 at 0:19

  • The binary or | was a typo
    – ozfive
    May 2 at 0:28

  • Very good answer! You can also use the join method of String, see docs.oracle.com/javase/8/docs/api/java/lang/…
    – Koekje
    May 2 at 11:40

  • @Koekje I forgot about that method, it seems ... Thanks for reminding me :)
    – Vogel612♦
    May 2 at 11:45
Your Answer




StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);








 

draft saved


draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f193405%2ftitle-capitalization-of-strings-in-java%23new-answer', 'question_page');

);

Post as a guest






























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
4
down vote



accepted










  • CapFirstTitle() is a method name that disregards java coding conventions for method names. Use standard lowerCamelCase.



  • whitelist and alwaysCapitalize should be private static final members. You can initialize these in a static initializer block like this:



    private static final Set<String> whitelist = new HashSet<>();

    static
    whitelist.addAll(Arrays.asList("a", "above", "after", ...);



  • The binary or is a little uncommon in conditions. You won't really get much benefit out of it, especially since the boolean or is short-circuiting. Use || over | here.


  • There is a lot of vertical whitespace here. I personally find that hard to follow and I'd delete most of that space if I wrote this code myself.
    YMMV. Since you're extremely consistent about it, I don't see any need to change it.


  • You're calling toLowerCase without specifying the Locale. While this may work in most cases, you should be aware of edge-cases involving languages like Turkish. If you have any way to be locale-aware here, you should strive to be. That can save you a lot of headaches down the line :)



  • This code always appends a " " at the end of the String. You could avoid this by first overwriting the parts array with the capitalized parts and subsequently using Arrays.stream(parts).collect(Collectors.joining(" "));



    Since Java 8, String exposes the method join (thanks to Koekje for pointing that out in the coments), you can use as String.join(" ", parts);



The last point also contains an option for a deeper rewrite. Consider the following:



String capitalizedParts = Arrays.stream(input.toLowerCase().trim().split(" "))
.map(CapFirstTitle::capSingleWord)
.toArray(String::new);
capitalizedParts[0] = WordUtils.capitalize(capitalizedParts[0], ' ', '-');
int lastIdx = capitalizedParts.length - 1;
capitalizedParts[lastIdx] = WordUtils.capitalize(capitalizedParts[lastIdx], ' ', '-');

return String.join(" ", capitalizedParts);


This is a solution that hides the details of whitelist and alwaysCapitalize inside a method and clears the enforced capitalization of the first and last word.



It avoids the (correct, but mental overhead inducing) use of StringBuilder and separates the capitalization logic from the reassembly of the String. You can even go so far and reduce multiple spaces in the same process by filtering out empty array elements in the first Stream.






share|improve this answer























  • This is incredible. I usually have to work on an island and this is going to help me become a better programmer! I really appreciate all of the points made. I will take them to heart. Thank you!
    – ozfive
    May 2 at 0:19










  • The binary or | was a typo
    – ozfive
    May 2 at 0:28










  • Very good answer! You can also use the join method of String, see docs.oracle.com/javase/8/docs/api/java/lang/…
    – Koekje
    May 2 at 11:40










  • @Koekje I forgot about that method, it seems ... Thanks for reminding me :)
    – Vogel612♦
    May 2 at 11:45















up vote
4
down vote



accepted










  • CapFirstTitle() is a method name that disregards java coding conventions for method names. Use standard lowerCamelCase.



  • whitelist and alwaysCapitalize should be private static final members. You can initialize these in a static initializer block like this:



    private static final Set<String> whitelist = new HashSet<>();

    static
    whitelist.addAll(Arrays.asList("a", "above", "after", ...);



  • The binary or is a little uncommon in conditions. You won't really get much benefit out of it, especially since the boolean or is short-circuiting. Use || over | here.


  • There is a lot of vertical whitespace here. I personally find that hard to follow and I'd delete most of that space if I wrote this code myself.
    YMMV. Since you're extremely consistent about it, I don't see any need to change it.


  • You're calling toLowerCase without specifying the Locale. While this may work in most cases, you should be aware of edge-cases involving languages like Turkish. If you have any way to be locale-aware here, you should strive to be. That can save you a lot of headaches down the line :)



  • This code always appends a " " at the end of the String. You could avoid this by first overwriting the parts array with the capitalized parts and subsequently using Arrays.stream(parts).collect(Collectors.joining(" "));



    Since Java 8, String exposes the method join (thanks to Koekje for pointing that out in the coments), you can use as String.join(" ", parts);



The last point also contains an option for a deeper rewrite. Consider the following:



String capitalizedParts = Arrays.stream(input.toLowerCase().trim().split(" "))
.map(CapFirstTitle::capSingleWord)
.toArray(String::new);
capitalizedParts[0] = WordUtils.capitalize(capitalizedParts[0], ' ', '-');
int lastIdx = capitalizedParts.length - 1;
capitalizedParts[lastIdx] = WordUtils.capitalize(capitalizedParts[lastIdx], ' ', '-');

return String.join(" ", capitalizedParts);


This is a solution that hides the details of whitelist and alwaysCapitalize inside a method and clears the enforced capitalization of the first and last word.



It avoids the (correct, but mental overhead inducing) use of StringBuilder and separates the capitalization logic from the reassembly of the String. You can even go so far and reduce multiple spaces in the same process by filtering out empty array elements in the first Stream.






share|improve this answer























  • This is incredible. I usually have to work on an island and this is going to help me become a better programmer! I really appreciate all of the points made. I will take them to heart. Thank you!
    – ozfive
    May 2 at 0:19










  • The binary or | was a typo
    – ozfive
    May 2 at 0:28










  • Very good answer! You can also use the join method of String, see docs.oracle.com/javase/8/docs/api/java/lang/…
    – Koekje
    May 2 at 11:40










  • @Koekje I forgot about that method, it seems ... Thanks for reminding me :)
    – Vogel612♦
    May 2 at 11:45













up vote
4
down vote



accepted







up vote
4
down vote



accepted






  • CapFirstTitle() is a method name that disregards java coding conventions for method names. Use standard lowerCamelCase.



  • whitelist and alwaysCapitalize should be private static final members. You can initialize these in a static initializer block like this:



    private static final Set<String> whitelist = new HashSet<>();

    static
    whitelist.addAll(Arrays.asList("a", "above", "after", ...);



  • The binary or is a little uncommon in conditions. You won't really get much benefit out of it, especially since the boolean or is short-circuiting. Use || over | here.


  • There is a lot of vertical whitespace here. I personally find that hard to follow and I'd delete most of that space if I wrote this code myself.
    YMMV. Since you're extremely consistent about it, I don't see any need to change it.


  • You're calling toLowerCase without specifying the Locale. While this may work in most cases, you should be aware of edge-cases involving languages like Turkish. If you have any way to be locale-aware here, you should strive to be. That can save you a lot of headaches down the line :)



  • This code always appends a " " at the end of the String. You could avoid this by first overwriting the parts array with the capitalized parts and subsequently using Arrays.stream(parts).collect(Collectors.joining(" "));



    Since Java 8, String also exposes the method join (thanks to Koekje for pointing that out in the comments), which you can use as String.join(" ", parts);
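The locale and trailing-space points above can be illustrated with a small sketch. The class and method names here (LocaleJoinSketch, lowerNeutral, lowerTurkish, joinWords) are made up for illustration and are not part of the original code; Locale.ROOT is just one possible choice when no user locale is available.

```java
import java.util.Locale;

final class LocaleJoinSketch {

    // Lowercases with locale-neutral rules, so the result does not depend
    // on the default locale of the JVM the code happens to run on.
    static String lowerNeutral(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    // Lowercases with Turkish rules, where 'I' maps to the dotless 'ı' (U+0131).
    static String lowerTurkish(String s) {
        return s.toLowerCase(Locale.forLanguageTag("tr"));
    }

    // Joins the words with single spaces; no trailing separator is appended.
    static String joinWords(String[] parts) {
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        System.out.println(lowerNeutral("TITLE"));
        System.out.println(lowerTurkish("TITLE"));
        System.out.println("[" + joinWords(new String[] {"Title", "Case"}) + "]");
    }
}
```

A whitelist lookup for a word such as "if" or "in" would miss on a machine whose default locale is Turkish if the input was lowercased without an explicit Locale, which is why passing one explicitly is worth the extra argument.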



The last point also contains an option for a deeper rewrite. Consider the following:



String[] capitalizedParts = Arrays.stream(input.toLowerCase().trim().split(" "))
    .map(CapFirstTitle::capSingleWord)
    .toArray(String[]::new);
capitalizedParts[0] = WordUtils.capitalize(capitalizedParts[0], ' ', '-');
int lastIdx = capitalizedParts.length - 1;
capitalizedParts[lastIdx] = WordUtils.capitalize(capitalizedParts[lastIdx], ' ', '-');

return String.join(" ", capitalizedParts);


This is a solution that hides the details of whitelist and alwaysCapitalize inside a method and makes the enforced capitalization of the first and last word explicit.



It avoids the (correct, but mentally taxing) use of StringBuilder and separates the capitalization logic from the reassembly of the String. You can even go further and collapse runs of multiple spaces in the same pass by filtering out empty array elements in the first Stream.
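For reference, here is one way the whole approach might look as a self-contained class, including the filter for empty elements. Treat it as a sketch under assumptions rather than the definitive version: the body of capSingleWord is a guess at what that method would do, the two word sets are short stand-ins for the full lists in the question, a small hand-written capitalize helper takes the place of WordUtils.capitalize so the example needs no external library, and Locale.ROOT is assumed as the locale.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class TitleCaseSketch {

    // Stand-in word lists; the lists in the question are much longer.
    private static final Set<String> WHITELIST =
            new HashSet<>(Arrays.asList("a", "an", "and", "from", "of", "the", "to"));
    private static final Set<String> ALWAYS_CAPITALIZE =
            new HashSet<>(Arrays.asList("II", "III", "IV"));

    // Hypothetical replacement for WordUtils.capitalize(word, ' ', '-'):
    // upper-cases the first character and every character following a hyphen.
    static String capitalize(String word) {
        StringBuilder sb = new StringBuilder(word.length());
        boolean atStart = true;
        for (char c : word.toCharArray()) {
            sb.append(atStart ? Character.toUpperCase(c) : c);
            atStart = (c == '-');
        }
        return sb.toString();
    }

    // Hypothetical per-word rule; expects an already lower-cased word.
    static String capSingleWord(String word) {
        String upper = word.toUpperCase(Locale.ROOT);
        if (ALWAYS_CAPITALIZE.contains(upper)) {
            return upper;
        }
        if (WHITELIST.contains(word)) {
            return word;
        }
        return capitalize(word);
    }

    static String toTitleCase(String input) {
        String[] parts = Arrays.stream(input.toLowerCase(Locale.ROOT).trim().split(" "))
                .filter(part -> !part.isEmpty()) // drops the gaps left by repeated spaces
                .map(TitleCaseSketch::capSingleWord)
                .toArray(String[]::new);
        if (parts.length == 0) {
            return "";
        }
        // The first and last word are capitalized even if they are whitelisted.
        parts[0] = capitalize(parts[0]);
        int lastIdx = parts.length - 1;
        parts[lastIdx] = capitalize(parts[lastIdx]);
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        System.out.println(toTitleCase("  from a   hyphenated-word to the  end "));
    }
}
```

The guard for an empty array matters once the filter is in place: a blank input would otherwise fail on parts[0].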














edited May 2 at 11:45


























answered May 1 at 23:28









Vogel612♦

























 













































































