Find the minimum value that data could have had before it was rounded
I have this function in R:
Minimum <- function(data) {
  answer <- numeric(length(data))
  diference <- c(0, diff(data, lag = 1, differences = 1))  # padded with a leading 0
  answer[1] <- data[1]
  for (i in 2:length(diference)) {
    if (diference[i] == 0) {
      answer[i] <- answer[i - 1]
    } else {
      answer[i] <- data[i] - diference[i] / 2
    }
  }
  return(answer)
}
Its purpose is to find the minimum value that "data" could have had before it was rounded.
For each element, the minimum possible value is the midpoint between the current value and the previous distinct value, i.e. the average of the two values on either side of the last change of value in "data".
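For example, with the data from the discussion below (values rounded to one decimal):

data <- c(1.1, 1.2, 1.3, 1.3)
Minimum(data)
#> [1] 1.10 1.15 1.25 1.25

The first element has no preceding change of value, so the function falls back to data[1] itself.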
This code works, but since for loops are inefficient in R, it is advised to vectorize the function.
The problem is that each element of "answer" depends on the previous elements of "answer", so I cannot express it as a simple element-wise (lambda) function.
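One loop-free way to express that dependence is Reduce with accumulate = TRUE, which folds over the positions while carrying the previous answer along. A minimal sketch (an editor's illustration, not from the original post; Minimum_reduce is a hypothetical name, and this is loop-free rather than truly vectorized):

Minimum_reduce <- function(data) {
  diference <- c(0, diff(data))
  Reduce(
    function(prev, i) {
      # same recurrence as the loop body above
      if (diference[i] == 0) prev else data[i] - diference[i] / 2
    },
    x = seq_along(data)[-1],  # positions 2..n
    init = data[1],
    accumulate = TRUE         # keep every intermediate answer
  )
}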
Tags: r, vectorization

asked Apr 16 at 16:03 by yoxota; edited Apr 16 at 16:51 by Sam Onela
I don't understand what this has to do with rounding, or the meaning of "The minimum possible value is the average of the values which 'data' had at the last change of value in 'data'". Could you elaborate? Also, diference[i] == 0 will be subject to floating-point errors, so it is not reliable if you are dealing with numeric (non-integer) vectors.
– flodel, Apr 16 at 23:01
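(A common remedy, sketched here as an illustration rather than taken from the thread: compare against a small tolerance instead of exact zero. near_zero and tol are hypothetical names.)

near_zero <- function(x, tol = sqrt(.Machine$double.eps)) abs(x) < tol
# inside the loop, use near_zero(diference[i]) instead of diference[i] == 0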
For example, the numbers [1.14, 1.23, 1.28, 1.35] could be rounded to [1.1, 1.2, 1.3, 1.3]. The minimum possible values would be [?, 1.15, 1.25, 1.25]; otherwise they would not have been rounded that way. Even worse, in this example I know the modulus/precision of the rounding, but in real life it is unknown how the values were rounded. They could have been rounded to multiples of pi, or who knows what number.
– yoxota, Apr 17 at 14:26
Thanks for the explanation. If I understand correctly, it assumes your input data is increasing. If so, you might want to check that assumption in your function, something like stopifnot(all(diff(data) >= 0)).
– flodel, Apr 17 at 23:53
I gave it some thought. Should you not look for the smallest value in diff(data) that is not exactly zero and make that your (estimated) rounding precision for all values?

Minimum <- function(data) {
  d <- diff(data)
  p <- min(d[d > 0])
  data - p / 2
}

It's all vectorized, faster, and provides a better (larger) minimum bound on your pre-rounded data.
– flodel, Apr 18 at 0:01
@flodel Sorry for the delay. Looking for the smallest difference is complicated, because small values are noisy, so the smallest differences correspond to noise around a difference of 0. It looks like the rounding scales up with the value.
– yoxota, Apr 25 at 13:57
1 Answer

Accepted answer (score 4)
You can first assign values only where difference != 0, and then use na.locf from the zoo package to fill the remaining NAs with the last available (carried-forward) value.
minimum_new <- function(data) {
  answer <- rep(NA, length(data))
  difference <- c(0, diff(data, lag = 1, differences = 1)) / 2  # half the padded differences
  answer[1] <- data[1]
  answer[difference != 0] <- data[difference != 0] - difference[difference != 0]
  answer <- zoo::na.locf(answer, na.rm = FALSE)  # last observation carried forward
  answer
}
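As an aside (an editor's sketch, not part of the original answer): if you would rather avoid the zoo dependency, the NA-filling step can be done in base R with a cummax index trick; na_locf_base is a hypothetical helper name.

na_locf_base <- function(x) {
  # for every element, the index of the most recent non-NA position;
  # answer[1] is always set above, so the index never stays at 0
  idx <- cummax(seq_along(x) * !is.na(x))
  x[idx]
}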
The vectorised version is at least twice as fast for me.
> data <- sample(10, 10000, replace = TRUE)
> check <- function(values) all(sapply(values[-1], function(x) identical(values[[1]], x)))
> bench <- microbenchmark::microbenchmark(loop = Minimum(data), vectorised = minimum_new(data), check=check)
Unit: microseconds
expr min lq mean median uq max neval cld
loop 1401.959 1415.552 1665.816 1457.274 1586.407 4620.835 100 b
vectorised 742.325 758.183 1111.202 796.507 1383.268 2587.940 100 a
With check = check, the benchmark also verifies that both implementations produce identical output.

answered Apr 16 at 17:53 by m0nhawk; edited Apr 17 at 15:32
I would never have found the na.locf function by myself. I wonder how I could have found it. Thank you. Have a reward: youtube.com/watch?v=fD7ji3YOwcM
– yoxota, Apr 17 at 14:20