Find the minimum value that data could have had before it was rounded

I have this function in R:



Minimum <- function(data) {
    answer <- numeric(length(data))

    # Differences between consecutive values, padded with a leading 0
    diference <- c(0, diff(data, lag = 1, differences = 1))

    answer[1] <- data[1]
    for (i in 2:length(diference)) {
        if (diference[i] == 0) {
            answer[i] <- answer[i - 1]
        } else {
            answer[i] <- data[i] - diference[i] / 2
        }
    }
    return(answer)
}



Its purpose is to find the minimum value that "data" could have had before it was rounded.

The minimum possible value is the average of the values which "data" had at the last change of value in "data".

This code works, but since for loops are often slow in R, I have been advised to vectorize the function.

The problem is that each element of the "answer" vector depends on earlier elements of "answer", so I cannot simply apply a lambda function element by element.
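
A small worked example may make the rule concrete. The sketch below restates the loop under a hypothetical name, min_bound_loop (not part of the original post), and iterates over seq_len(n)[-1] so that a one-element input does not run into 2:1. The sample vector is an assumed input, chosen only for illustration.

```r
# Hypothetical restatement of the loop, for illustration only.
min_bound_loop <- function(data) {
  n <- length(data)
  answer <- numeric(n)
  if (n == 0) return(answer)

  # Step between consecutive values, padded with a leading 0
  step <- c(0, diff(data))

  answer[1] <- data[1]
  for (i in seq_len(n)[-1]) {
    if (step[i] == 0) {
      # No change: carry the previous bound forward
      answer[i] <- answer[i - 1]
    } else {
      # Change: midpoint between the previous and the current value
      answer[i] <- data[i] - step[i] / 2
    }
  }
  answer
}

rounded <- c(1.1, 1.2, 1.3, 1.3)   # assumed sample input
bounds <- min_bound_loop(rounded)
print(bounds)
```

Up to floating-point error, the bounds for this input are 1.1, 1.15, 1.25 and 1.25: the last element repeats the previous bound because the value did not change.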







  • I don't understand what this has to do with rounding, or the meaning of 'The minimum possible value is the average of the values which "data" had at the last change of value in "data"'. Could you elaborate? Also, diference[i]==0 will be subject to floating-point errors, so it is not reliable if you are dealing with numeric (non-integer) vectors.
    – flodel
    Apr 16 at 23:01










  • For example, the numbers [1.14 1.23 1.28 1.35] could be rounded to [1.1 1.2 1.3 1.3]. The minimum possible value would be [? 1.15 1.25 1.25]; otherwise they would not have been rounded that way. Even worse, in this example I know the precision (modulus) of the rounding, but in real life it is unknown how the values were rounded. They could have been rounded to multiples of pi, or who knows what number.
    – yoxota
    Apr 17 at 14:26











  • Thanks for the explanation. If I understand correctly, your function assumes that the input data is increasing. If so, you might want to check that assumption in your function, with something like stopifnot(all(diff(data) >= 0)).
    – flodel
    Apr 17 at 23:53










  • I gave it some thought. Should you not look for the smallest value in diff(data) that is not exactly zero and use that as your (estimated) rounding precision for all values? Minimum <- function(data) { d <- diff(data); p <- min(d[d > 0]); data - p/2 }. It's all vectorized, faster, and provides a better (larger) minimum bound on your pre-rounded data.
    – flodel
    Apr 18 at 0:01











  • @flodel Sorry for the delay. Looking for the smallest difference is complicated, because small values are noisy, so the smallest differences correspond to noise around difference=0. It also looks like the rounding step scales up with the value.
    – yoxota
    Apr 25 at 13:57
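
The single-precision idea from the comments can be sketched as follows. This is only an illustration under stated assumptions: the name min_bound_global and the tol argument are hypothetical, tol stands in for the exact comparison with zero that the first comment warns about, and the approach assumes one rounding step for the whole vector, which the last comment says does not hold for noisy data whose rounding scales with the value.

```r
# Hypothetical helper: estimate one global rounding step from the data.
min_bound_global <- function(data, tol = 1e-9) {
  d <- diff(data)

  # The approach assumes non-decreasing input (allowing for float noise)
  stopifnot(all(d >= -tol))

  # Treat differences within tol of zero as "no change"
  steps <- d[d > tol]
  if (length(steps) == 0) return(data)  # no change observed, nothing to estimate

  # Smallest observed step is taken as the rounding precision
  data - min(steps) / 2
}

global_bounds <- min_bound_global(c(1.1, 1.2, 1.3, 1.3))
print(global_bounds)
```

For this assumed input every element is lowered by about 0.05, so the first element also gets a bound, unlike the loop version, which leaves it at the rounded value.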
















edited Apr 16 at 16:51









Sam Onela

asked Apr 16 at 16:03









yoxota

1 Answer

You can first fill in only the positions where difference != 0, and then use na.locf (from the zoo package) to replace the remaining NAs with the last available value.



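minimum_new <- function(data) {
    answer <- rep(NA, length(data))
    # Half of each step between consecutive values, padded with a leading 0
    difference <- c(0, diff(data, lag = 1, differences = 1)) / 2
    answer[1] <- data[1]
    # Where the value changed, take the midpoint between the old and new value
    answer[difference != 0] <- data[difference != 0] - difference[difference != 0]
    # Carry the last computed value forward over the unchanged positions
    answer <- zoo::na.locf(answer, na.rm = FALSE)
    answer
}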



This version is nearly twice as fast for me (by median time).



> data <- sample(10, 10000, replace = TRUE)
> check <- function(values) all(sapply(values[-1], function(x) identical(values[[1]], x)))
> bench <- microbenchmark::microbenchmark(loop = Minimum(data), vectorised = minimum_new(data), check = check)
> bench
Unit: microseconds
       expr      min       lq     mean   median       uq      max neval cld
       loop 1401.959 1415.552 1665.816 1457.274 1586.407 4620.835   100   b
 vectorised  742.325  758.183 1111.202  796.507 1383.268 2587.940   100  a


The check argument also verifies that both functions return identical output.
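
If depending on zoo is not desirable, the same carry-forward step can be sketched in base R. This is an alternative technique, not the method shown above: it indexes the changed positions with cumsum instead of calling na.locf. The name minimum_base is hypothetical, and the sketch assumes the input contains no NA values.

```r
# Hypothetical base-R variant: carry-forward via cumsum indexing.
minimum_base <- function(data) {
  if (length(data) == 0) return(numeric(0))

  # Half of each step between consecutive values, padded with a leading 0
  half_step <- c(0, diff(data)) / 2

  # Positions where the value changed; the first element always counts
  changed <- half_step != 0
  changed[1] <- TRUE

  # Midpoint between the previous and current value at every position
  midpoints <- data - half_step

  # cumsum(changed) maps each position to its most recent change,
  # so indexing the changed midpoints with it carries values forward
  midpoints[changed][cumsum(changed)]
}

base_bounds <- minimum_base(c(1.1, 1.2, 1.3, 1.3))
print(base_bounds)
```

The indexing works because cumsum(changed) counts how many changes have occurred up to each position, which is exactly the index of the last change within the subset midpoints[changed].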






  • I would never have found the na.locf function by myself. I wonder how I could have found it. Thank you. Have a reward: youtube.com/watch?v=fD7ji3YOwcM
    – yoxota
    Apr 17 at 14:20










edited Apr 17 at 15:32


























answered Apr 16 at 17:53









m0nhawk
