Parsing an email string

My program needs to parse an e-mail string. An address can be entered in two ways: with an alias (display name), or as the plain e-mail address without one.



1st possibility:



string addressWithAlias = "test my address <bla@blub.com>";


2nd possibility:



string addressWithoutAlias = "bla@blub.com";


So, I wrote two functions:



private static string[] getAddressPartsRegex(string address)
{
    string plainaddress = address.Trim();

    Regex reg = new Regex(@"(.+?(?=<))<(.*@.*?)>");
    var gr = reg.Match(plainaddress).Groups;

    return gr.Count == 1
        ? new[] { plainaddress }
        : new[] { gr[1].Value.Trim(), gr[2].Value.Trim() };
}

private static string[] getAddressParts(string address)
{
    var splittedAdress = address.Split(' ');
    return splittedAdress.Last().Trim().StartsWith("<")
        ? new[] { string.Join(" ", splittedAdress.Take(splittedAdress.Length - 1)), splittedAdress.Last().Trim(' ', '<', '>') }
        : splittedAdress;
}



They both work fine and produce the same results. One uses a regex, the other uses Split and Join.
Which one would you suggest, and why? Which is the more elegant function?
Are there any bugs I didn't see?









edited Mar 1 at 11:42
t3chb0t

asked Mar 1 at 11:33
Matthias Burger







  • 1




    @BCdotWEB this is the standard when writing an e-mail: you won't see the e-mail address, you see a name. Something like James Bond <james.bond@mi6.co.uk> will be displayed in Outlook as just James Bond instead of his e-mail address. So, this is kind of a standard :)
    – Matthias Burger
    Mar 1 at 13:06







  • 3




    I mean... There are way more ways to send email addresses than you listed. here, here, here for examples
    – Dannnno
    Mar 1 at 15:16






  • 1




    Ultimately, the only valid email address is one that you can send email to; it is a lot more useful to see if your sending tool of choice can handle the email address than to check the arbitrary (hopefully a subset, but not always) criteria from the RFC you chose to enforce. Even then, you can't validate that it is a real email address; the only way to do that is by sending to it.
    – Dannnno
    Mar 1 at 15:18






  • 1




    ex-parrot.com/~pdw/Mail-RFC822-Address.html
    – MCMastery
    Mar 1 at 15:37






  • 1




    @MatthiasBurger lol just found that and it made me laugh
    – MCMastery
    Mar 1 at 15:46












2 Answers
up vote
10
down vote



accepted






Consider taking advantage of existing framework features that could provide an additional layer of validation, mainly System.Net.Mail.MailAddress.



Also, as mentioned in a comment, there is no need to create the regular expression every time the function is called.



static Regex mailExpression = new Regex(@"(.+?(?=<))<(.*@.*?)>");

private static MailAddress getAddress(string address)
{
    if (address == null) throw new ArgumentNullException("address");
    if (string.IsNullOrWhiteSpace(address)) throw new ArgumentException("invalid address", "address");

    var plainaddress = address.Trim();
    var groups = mailExpression.Match(plainaddress).Groups;

    return groups.Count == 1
        ? new MailAddress(plainaddress)
        : new MailAddress(groups[2].Value.Trim(), groups[1].Value.Trim());
}



According to the reference source, MailAddress internally tries to parse the address given to it.



This avoids having to roll your own parser, since one already exists out of the box that is tried, tested and stable.



private static MailAddress getAddress(string address)
{
    if (address == null) throw new ArgumentNullException("address");
    if (string.IsNullOrWhiteSpace(address)) throw new ArgumentException("invalid address", "address");

    address = address.Trim();
    return new MailAddress(address);
}



You also get a strongly typed object model to work with, exposing usable properties such as DisplayName, Address, User and Host.
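To make those properties concrete, here is a minimal sketch. The class and method names (MailAddressParts, Describe) are made up for illustration; only the MailAddress constructor and its DisplayName, Address, User and Host properties come from the framework.

```csharp
using System;
using System.Net.Mail;

public static class MailAddressParts
{
    // Hypothetical helper (not part of the answer): returns the alias,
    // the bare address, the local part and the domain, in that order.
    public static string[] Describe(string input)
    {
        var parsed = new MailAddress(input.Trim());
        return new[]
        {
            parsed.DisplayName, // alias; empty when none was given
            parsed.Address,     // address without alias or angle brackets
            parsed.User,        // part before the '@'
            parsed.Host         // part after the '@'
        };
    }
}
```

Calling MailAddressParts.Describe("test my address <bla@blub.com>") would give the alias and the bare address as the first two elements, which is the same split the two hand-written functions in the question produce.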



The following Unit Test demonstrates the desired behavior.



using System;
using System.Net.Mail;
using FluentAssertions;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class EmailParserTest
{
    [TestMethod]
    public void Should_Parse_EmailAddress_With_Alias()
    {
        //Arrange
        var expectedAlias = "test my address";
        var expectedAddress = "bla@blub.com";
        string addressWithAlias = "test my address <bla@blub.com>";

        //Act
        var mailAddressWithAlias = getAddress(addressWithAlias);

        //Assert
        mailAddressWithAlias
            .Should()
            .NotBeNull()
            .And.Match<MailAddress>(_ => _.Address == expectedAddress && _.DisplayName == expectedAlias);
    }

    [TestMethod]
    public void Should_Parse_EmailAddress_Without_Alias()
    {
        //Arrange
        var addressWithoutAlias = "bla@blub.com";

        //Act
        var mailAddressWithoutAlias = getAddress(addressWithoutAlias);

        //Assert
        mailAddressWithoutAlias
            .Should()
            .NotBeNull()
            .And.Match<MailAddress>(_ => _.Address == addressWithoutAlias && _.DisplayName == string.Empty);
    }

    // Method under test, repeated here so the test class is self-contained.
    private static MailAddress getAddress(string address)
    {
        if (address == null) throw new ArgumentNullException("address");
        if (string.IsNullOrWhiteSpace(address)) throw new ArgumentException("invalid address", "address");

        address = address.Trim();
        return new MailAddress(address);
    }
}
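The tests above cover two well-formed inputs. For malformed input the MailAddress constructor throws a FormatException, so callers that prefer a boolean result could wrap it. The following is only a sketch; the names MailAddressParser and TryGetAddress are hypothetical and not part of the answer.

```csharp
using System;
using System.Net.Mail;

public static class MailAddressParser
{
    // Hypothetical wrapper: reports failure through the return value
    // instead of letting the constructor's exception propagate.
    public static bool TryGetAddress(string input, out MailAddress result)
    {
        result = null;
        if (string.IsNullOrWhiteSpace(input)) return false;

        try
        {
            result = new MailAddress(input.Trim());
            return true;
        }
        catch (FormatException)
        {
            // The constructor throws FormatException when the text
            // cannot be parsed as an address.
            return false;
        }
    }
}
```

Whether to throw or to return false is a design choice: the throwing getAddress suits input that is expected to be valid, while a Try-style method suits user-entered text where bad input is routine.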








edited Mar 1 at 15:55

answered Mar 1 at 14:51
Nkosi











  • I'm working with MimeKit.MailboxAddress here; unfortunately, the MimeKit library doesn't do me that favour, and MailKit throws an error. But you are right, of course :)
    – Matthias Burger
    Mar 1 at 15:15










  • I now let System.Net.Mail handle this. It's better indeed. And an interesting unit test; which library did you use?
    – Matthias Burger
    Mar 1 at 15:29






  • 1




    I used the standard VS testing tools for the test runner and Fluent Assertions to assert
    – Nkosi
    Mar 1 at 15:34
















up vote
5
down vote













Let me suggest an alternative regular expression that correctly handles both cases:



(.*?)<?(bS+@S+b)>?


This regular expression correctly identifies both patterns you want to support. Noteworthy here is the use of \S in the email address to exclude whitespace characters, which your original regex incorrectly allows. As a result, the original accepts something like the following as a valid email specification:



bla bla <te st@ exampl e.com>
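
To make the difference concrete, here is a minimal sketch that runs both patterns against that input. The class name PatternComparison is only an illustration and is not part of the original code; the suggested pattern is written with its backslash escapes (\b, \S).

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternComparison
{
    // Pattern from the question: '.' also matches spaces inside the address.
    public static readonly Regex Original = new Regex(@"(.+?(?=<))<(.*@.*?)>");

    // Pattern suggested in this answer: \S does not match whitespace.
    public static readonly Regex Suggested = new Regex(@"(.*?)<?(\b\S+@\S+\b)>?");

    public static void Main()
    {
        string malformed = "bla bla <te st@ exampl e.com>";

        Console.WriteLine(Original.IsMatch(malformed));   // True: whitespace slips through
        Console.WriteLine(Suggested.IsMatch(malformed));  // False: no \S+@\S+ run exists
    }
}
```

Note that a failed match still returns a Match object whose groups are empty strings, so callers may want to check Match.Success before reading the groups.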


Another thing this regex does is accept email address specifications in which the address is not enclosed in <>. It does so by requiring the address to be surrounded by word boundaries (\b).



You should be able to easily use it like so:



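using System.Text.RegularExpressions;

static Regex mailExpression = new Regex(@"(.*?)<?(\b\S+@\S+\b)>?");

private static string[] getAddressParts(string addressSpec)
{
    var groups = mailExpression.Match(addressSpec).Groups;
    // Group 1 is the alias (empty when only an address was given); group 2 is the address.
    return string.IsNullOrWhiteSpace(groups[1].Value)
        ? new[] { groups[2].Value }
        : new[] { groups[1].Value.Trim(), groups[2].Value };
}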



Of course, this does not preclude using the very valid suggestion by Nkosi.
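
For comparison, here is a rough sketch of letting the framework do the parsing. It assumes the suggestion referred to is the System.Net.Mail.MailAddress class mentioned in the comments. The class name MailAddressSketch and the method GetAddressParts are hypothetical and only mirror the return shape used above.

```csharp
using System;
using System.Net.Mail;

public static class MailAddressSketch
{
    // Returns { address } or { alias, address }.
    public static string[] GetAddressParts(string addressSpec)
    {
        // The MailAddress constructor throws FormatException for input it cannot parse.
        var parsed = new MailAddress(addressSpec.Trim());

        // DisplayName is an empty string when the input carries no alias.
        return parsed.DisplayName.Length == 0
            ? new[] { parsed.Address }
            : new[] { parsed.DisplayName, parsed.Address };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", GetAddressParts("test my address <bla@blub.com>")));
        // test my address | bla@blub.com
        Console.WriteLine(string.Join(" | ", GetAddressParts("bla@blub.com")));
        // bla@blub.com
    }
}
```

The trade-off is that invalid input surfaces as an exception rather than as an empty result, so the caller has to decide how to handle FormatException.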






edited Mar 1 at 15:23


























answered Mar 1 at 15:03









Vogel612♦












  • FWIW this regex does not correctly deal with quoted-string local parts... A fix is pretty simple, though, and is left as an exercise in regex for the reader.
    – Vogel612♦
    Mar 1 at 16:51
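
One possible way to approach that exercise is sketched below. It is an assumption rather than the commenter's intended fix: the local part is allowed to be either a double-quoted run or the original \S+ run. The class name QuotedLocalPart is made up for this example, and escaped quotes inside the quoted string are not handled.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedLocalPart
{
    // Inside a verbatim string, "" stands for one literal double quote.
    // Local part: either "quoted text" (spaces allowed) or a run of non-whitespace.
    public static readonly Regex MailExpression =
        new Regex(@"(.*?)<?((?:""[^""]+""|\b\S+)@\S+\b)>?");

    public static void Main()
    {
        var quoted = MailExpression.Match("John <\"john doe\"@example.com>");
        Console.WriteLine(quoted.Groups[1].Value.Trim()); // John
        Console.WriteLine(quoted.Groups[2].Value);        // "john doe"@example.com

        var plain = MailExpression.Match("bla@blub.com");
        Console.WriteLine(plain.Groups[2].Value);         // bla@blub.com
    }
}
```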




























 













































































