IPv6 parsing in Rust

Here is some code to parse an IPv6 address.
An IPv6 address is 128 bits long.
In its printable form, it is written as eight hextets (1 hextet == 16 bits), each represented as a hexadecimal number, separated by colons.
For example:

fe80:0000:0000:0000:8657:e6fe:08d5:5325

Note that in each hextet, the leading zeros can be omitted. Here is the same address:

fe80:0:0:0:8657:e6fe:8d5:5325

Finally, if there are several consecutive hextets whose value is 0, they can be omitted and replaced by ::. Here is the same address again:

fe80::8657:e6fe:8d5:5325

The :: can be anywhere, not only in the middle. For instance, these are valid IPv6 addresses:

::1
ffff::

The null address can be represented as ::.
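As a quick cross-check of the rules above, the standard library's `std::net::Ipv6Addr` can be used as a reference parser. This is only a sketch for comparison, not the code under review; `parse_all` is a hypothetical helper name.

```rust
use std::net::Ipv6Addr;

/// Hypothetical helper: parse each textual form with the standard library
/// and return the addresses as 128-bit integers.
fn parse_all(forms: &[&str]) -> Vec<u128> {
    forms
        .iter()
        .map(|f| u128::from(f.parse::<Ipv6Addr>().expect("valid IPv6 text")))
        .collect()
}
```

Passing the three spellings of the fe80 address shown above should yield three identical integers, and `::` should yield 0.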

Finally, there's a special type of IPv6 address that provides compatibility with IPv4. The last 32 bits of these addresses represent an IPv4 address, and are written like this:

1111:2222:3333:4444:5555:6666:1.2.3.4

The IPv4 part MUST be at the end of the address for the address to be valid.

My code is inspired by the Go standard library's ParseIPv6 function.

The code is a bit long, so I posted it as a gist as well (which contains a few tests).
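To illustrate the IPv4-suffixed form, here is a small sketch that again leans on `std::net::Ipv6Addr` as a reference rather than on the code under review. `trailing_ipv4_octets` is a hypothetical helper name.

```rust
use std::net::Ipv6Addr;

/// Hypothetical helper: return the last 32 bits of an IPv6 address, given in
/// text form, as four IPv4-style octets. Returns `None` if the text is not a
/// valid IPv6 address.
fn trailing_ipv4_octets(text: &str) -> Option<[u8; 4]> {
    let addr: Ipv6Addr = text.parse().ok()?;
    let o = addr.octets();
    Some([o[12], o[13], o[14], o[15]])
}
```

The dotted suffix `1.2.3.4` and the hextets `0102:0304` describe the same last 32 bits, while a dotted quad placed at the front of the address is rejected.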

I'd like to know if:

there are ways to make this code more efficient (even using third-party crates)

is using bytes instead of characters OK? In an IPv6 address, all the characters are supposed to have an ASCII representation, so I think it's OK, but I'm not 100% sure. If I have to use characters, it's much more complicated because there's no way to index a string by character in Rust.
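On the bytes-versus-characters point: in UTF-8, every byte of a multi-byte character has its high bit set (value 0x80 or above), so such a byte can never be mistaken for an ASCII hex digit, ':' or '.'. The sketch below illustrates that working on `as_bytes()` is safe for this kind of input; `first_invalid_byte` is a hypothetical helper, not part of the code under review.

```rust
/// Hypothetical helper: index of the first byte that cannot appear in an
/// IPv6 address (anything other than an ASCII hex digit, ':' or '.').
/// Bytes of non-ASCII characters are always >= 0x80, so they are reported
/// as invalid rather than being misread as ASCII.
fn first_invalid_byte(s: &str) -> Option<usize> {
    s.as_bytes()
        .iter()
        .position(|&b| !(b.is_ascii_hexdigit() || b == b':' || b == b'.'))
}
```

A parser that scans bytes will therefore stop at the first byte of any non-ASCII character, which is the behaviour wanted for a malformed address.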

After this long introduction, the code:

use std::str::FromStr;

#[derive(Debug, Copy, Eq, PartialEq, Hash, Clone)]
pub struct Ipv6Address(u128);

impl FromStr for Ipv6Address {
    type Err = MalformedAddress;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // ... (see the gist for the body)
    }
}
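For comparison, here is a minimal sketch of one way a `from_str` for this type could be written. It is not the implementation from the gist (which follows the Go approach); it splits on the first `::` instead, and it deliberately does not handle the IPv4-suffixed form. `parse_hextets` and `parse_ipv6` are hypothetical helper names, and the types are redeclared so the block stands alone.

```rust
use std::str::FromStr;

#[derive(Debug, Copy, Clone, Eq, PartialEq, Hash)]
pub struct Ipv6Address(u128);

#[derive(Debug)]
pub struct MalformedAddress(String);

/// Parse colon-separated hextets; an empty string yields an empty list.
fn parse_hextets(part: &str) -> Option<Vec<u16>> {
    if part.is_empty() {
        return Some(Vec::new());
    }
    part.split(':')
        .map(|h| {
            // Require 1 to 4 hex digits; checking the digits explicitly also
            // rejects a leading '+', which `from_str_radix` would accept.
            if h.is_empty() || h.len() > 4 || !h.bytes().all(|b| b.is_ascii_hexdigit()) {
                return None;
            }
            u16::from_str_radix(h, 16).ok()
        })
        .collect()
}

/// Hypothetical helper: parse the hextet-only forms into a 128-bit value.
fn parse_ipv6(s: &str) -> Option<u128> {
    // Split on the first "::"; a second "::" leaves an empty hextet in the
    // tail, which `parse_hextets` rejects.
    let (head, tail, compressed) = match s.find("::") {
        Some(i) => (&s[..i], &s[i + 2..], true),
        None => (s, "", false),
    };
    let head = parse_hextets(head)?;
    let tail = parse_hextets(tail)?;
    let given = head.len() + tail.len();
    // "::" must stand for at least one hextet; without it all eight are required.
    if (compressed && given > 7) || (!compressed && given != 8) {
        return None;
    }
    let mut groups = [0u16; 8];
    groups[..head.len()].copy_from_slice(&head);
    groups[8 - tail.len()..].copy_from_slice(&tail);
    Some(groups.iter().fold(0u128, |acc, &g| (acc << 16) | u128::from(g)))
}

impl FromStr for Ipv6Address {
    type Err = MalformedAddress;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        parse_ipv6(s)
            .map(Ipv6Address)
            .ok_or_else(|| MalformedAddress(s.to_string()))
    }
}
```

This version allocates two small vectors per call, so it trades some efficiency for brevity; a single-pass byte scanner like the one described in the question avoids those allocations.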

Here are the helpers I'm using:

/// Check whether an ASCII character represents a hexadecimal digit
fn is_hex_digit(byte: u8) -> bool {
    match byte {
        b'0'..=b'9' | b'a'..=b'f' | b'A'..=b'F' => true,
        _ => false,
    }
}

/// Convert an ASCII character that represents a hexadecimal digit into this digit
fn hex_to_digit(byte: u8) -> u8 {
    match byte {
        b'0'..=b'9' => byte - b'0',
        b'a'..=b'f' => byte - b'a' + 10,
        b'A'..=b'F' => byte - b'A' + 10,
        _ => unreachable!(),
    }
}

/// Read up to four ASCII characters that represent hexadecimal digits, and return the number of
/// characters that were read, as well as their value. If no character is read, `(0, 0)` is
/// returned.
fn read_hextet(bytes: &[u8]) -> (usize, u16) {
    let mut count = 0;
    let mut digits: [u8; 4] = [0; 4];

    // Collect up to four leading hex digits
    for b in bytes {
        if is_hex_digit(*b) {
            digits[count] = hex_to_digit(*b);
            count += 1;
            if count == 4 {
                break;
            }
        } else {
            break;
        }
    }

    if count == 0 {
        return (0, 0);
    }

    // Combine the digits, most significant first
    let mut shift = (count - 1) * 4;
    let mut res = 0;
    for digit in &digits[0..count] {
        res += (*digit as u16) << shift;
        if shift >= 4 {
            shift -= 4;
        } else {
            break;
        }
    }

    (count, res)
}
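On the efficiency question, one possible simplification is to accumulate the value while scanning, which removes the intermediate `digits` array and the second loop. The following is a sketch under that assumption; `read_hextet_folded` is a hypothetical name, and it uses `char::to_digit` from the standard library in place of the two hand-written helpers.

```rust
/// Hypothetical variant of `read_hextet`: read up to four leading hex digits
/// and return how many were read together with their value.
fn read_hextet_folded(bytes: &[u8]) -> (usize, u16) {
    let mut count = 0;
    let mut value: u16 = 0;
    for &b in bytes.iter().take(4) {
        // `to_digit(16)` returns `None` for anything that is not a hex digit,
        // including bytes >= 0x80.
        match (b as char).to_digit(16) {
            Some(d) => {
                value = (value << 4) | d as u16;
                count += 1;
            }
            None => break,
        }
    }
    (count, value)
}
```

It keeps the same contract as the original helper: `(0, 0)` when no digit is read, and at most four digits consumed.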

I don't handle IPv4 parsing for now, so I'm just using this:

#[derive(Debug, Copy, Eq, PartialEq, Hash, Clone)]
pub struct Ipv4Address(u32);

impl Ipv4Address {
    fn parse(_: &[u8]) -> Result<u32, MalformedAddress> {
        unimplemented!();
    }
}
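For reference, the stub could eventually be filled in along the following lines. This is only a sketch of a dotted-quad parser with the same signature shape, written as a free function named `parse_ipv4` (a hypothetical name); it accepts leading zeros in an octet, which a stricter parser might reject.

```rust
#[derive(Debug)]
pub struct MalformedAddress(String);

/// Hypothetical dotted-quad parser: four decimal octets separated by '.'.
fn parse_ipv4(bytes: &[u8]) -> Result<u32, MalformedAddress> {
    let err = || MalformedAddress(String::from_utf8_lossy(bytes).into_owned());
    let mut value: u32 = 0;
    let mut octets = 0;
    for part in bytes.split(|&b| b == b'.') {
        // Each octet is 1 to 3 ASCII decimal digits
        if part.is_empty() || part.len() > 3 || !part.iter().all(|b| b.is_ascii_digit()) {
            return Err(err());
        }
        let n = part.iter().fold(0u32, |acc, &b| acc * 10 + u32::from(b - b'0'));
        if n > 255 {
            return Err(err());
        }
        value = (value << 8) | n;
        octets += 1;
    }
    if octets == 4 {
        Ok(value)
    } else {
        Err(err())
    }
}
```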

Finally, here is the error type I'm using:

use std::fmt;
use std::error::Error;

#[derive(Debug)]
pub struct MalformedAddress(String);

impl fmt::Display for MalformedAddress {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "malformed address: \"{}\"", self.0)
    }
}

impl Error for MalformedAddress {
    fn description(&self) -> &str {
        "the string cannot be parsed as an IP address"
    }

    fn cause(&self) -> Option<&dyn Error> {
        None
    }
}
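As a side note, `description` and `cause` were deprecated in later Rust releases, and every method of the `Error` trait has a default implementation. A shorter version of the error type might therefore look like the sketch below; `describe` is a hypothetical helper added only to show the type being used through a trait object.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
pub struct MalformedAddress(String);

impl fmt::Display for MalformedAddress {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "malformed address: \"{}\"", self.0)
    }
}

// All `Error` methods have default implementations, so an empty impl is enough.
impl Error for MalformedAddress {}

/// Hypothetical helper: render any error through the `Error` trait object.
fn describe(e: &dyn Error) -> String {
    e.to_string()
}
```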

asked Jun 21 at 18:18 by little-dude
edited Jun 21 at 18:39 by 200_success

