Interpolate row-wise values of 0 between two columns with values >0 in R


I am trying to interpolate zeros that lie between two non-zero values, row-wise, across the columns 2018 to 2021 of a data.frame in R. This is what the sample data df1 looks like:

   ID string1 2018 2019 2020 2021 string2
1: a1      x2    3    3    0    4      si
2: a2      g3    5    5    4    0      q2
3: a3      n2   11    0    0    3      oq
4: a4      m3    3    0    9    8      mx
5: a5      2w    9    1    6    5      ix
6: a6     ps2    2    4    7    4      p2
7: a7     kg2    6    0    9    6      2q

For convenient reproducibility:

library(data.table)

df1 = data.table(
  ID = c("a1", "a2", "a3", "a4", "a5", "a6", "a7"),
  "string1" = c("x2", "g3", "n2", "m3", "2w", "ps2", "kg2"),
  "2018" = c(3,5,11,3,9,2,6),
  "2019" = c(3,5,0,0,1,4,0),
  "2020" = c(0,4,0,9,6,7,9),
  "2021" = c(4,0,3,8,5,4,6),
  "string2" = c("si", "q2", "oq", "mx", "ix", "p2", "2q"))

In df1 there are cases with a zero between two numbers >0 (for example row 1/column 2020, row 4/column 2019, or row 7/column 2019). I want to identify these cases and replace the zero with the mean of the two neighbouring columns (for example, row 1/column 2020: (3 + 4) / 2 = 3.5).

Is there a way to do that? So far, I have only found methods that replace all zero values, without the condition that the zero lies between two numbers >0.

This is the output I am trying to get:

   ID string1 2018 2019 2020 2021 string2
1: a1      x2    3  3.0  3.5    4      si
2: a2      g3    5  5.0  4.0    0      q2
3: a3      n2   11  0.0  0.0    3      oq
4: a4      m3    3  6.0  9.0    8      mx
5: a5      2w    9  1.0  6.0    5      ix
6: a6     ps2    2  4.0  7.0    4      p2
7: a7     kg2    6  7.5  9.0    6      2q

Thank you very much!

---------------Answer---------------

Function to interpolate zeros between two positive elements (shift and fifelse come from data.table):

f <- function(vec){
  
  prev_val <- shift(vec, 1, fill = 0)
  next_val <- shift(vec, -1, fill = 0)
  
  fifelse(prev_val > 0 & next_val > 0 & vec == 0, (prev_val + next_val) / 2, vec)
}

Applying function to all rows for year columns:

year_cols <- grep("^[0-9]+$", names(df1), value = TRUE)
df1[, (year_cols) := transpose(lapply(transpose(.SD), f)), .SDcols = year_cols]

transpose is called twice: the first call turns each row into a column, so that lapply applies f to one row at a time, and the second call turns the result back into the original column layout.
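To check the rule on single rows without data.table, here is a minimal base R sketch of the same logic. The name interpolate_zeros is hypothetical, and ifelse with manual padding is swapped in for shift and fifelse, so treat it as an illustration rather than the code from the answer:

```r
# Hypothetical helper: base R stand-in for the shift()/fifelse() version.
interpolate_zeros <- function(vec) {
  n <- length(vec)
  prev_val <- c(0, vec[-n])   # left neighbour, padded with 0 before the first element
  next_val <- c(vec[-1], 0)   # right neighbour, padded with 0 after the last element
  # Replace a zero only when both neighbours are positive
  ifelse(vec == 0 & prev_val > 0 & next_val > 0,
         (prev_val + next_val) / 2,
         vec)
}

row1 <- interpolate_zeros(c(3, 3, 0, 4))    # row a1: the zero becomes 3.5
row2 <- interpolate_zeros(c(5, 5, 4, 0))    # row a2: trailing zero is left alone
row3 <- interpolate_zeros(c(11, 0, 0, 3))   # row a3: two adjacent zeros are left alone
```

Because the padding value is 0, a zero in the first or last year column never has two positive neighbours and therefore stays unchanged.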

---------------Answer---------------

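Here is a base R solution. It assumes df1 is a plain data.frame whose year columns are named X2018 to X2021 (the names data.frame() produces by default), as in the output below: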

for (i in c("X2019", "X2020")) {
  x <- which(colnames(df1) == i)
  # rows where this column is 0 and both neighbouring columns are > 0
  gap <- df1[, x] == 0 & df1[, x - 1] > 0 & df1[, x + 1] > 0
  df1[, x] <- ifelse(gap, rowMeans(df1[, c(x - 1, x + 1)]), df1[, x])
}

> df1
  ID string1 X2018 X2019 X2020 X2021 string2
1 a1      x2     3   3.0   3.5     4      si
2 a2      g3     5   5.0   4.0     0      q2
3 a3      n2    11   0.0   0.0     3      oq
4 a4      m3     3   6.0   9.0     8      mx
5 a5      2w     9   1.0   6.0     5      ix
6 a6     ps2     2   4.0   7.0     4      p2
7 a7     kg2     6   7.5   9.0     6      2q
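If the year columns should not be hard-coded, the same idea can be sketched with a matrix of shifted neighbours. This is only one possible variant, not part of the answer above: the data frame name df, the check.names = FALSE setting, and the helper matrices prev_m, next_m and fill are all assumptions made for illustration.

```r
# Plain data.frame; check.names = FALSE keeps the year names as "2018", "2019", ...
df <- data.frame(
  ID = c("a1", "a2", "a3", "a4", "a5", "a6", "a7"),
  "2018" = c(3, 5, 11, 3, 9, 2, 6),
  "2019" = c(3, 5, 0, 0, 1, 4, 0),
  "2020" = c(0, 4, 0, 9, 6, 7, 9),
  "2021" = c(4, 0, 3, 8, 5, 4, 6),
  check.names = FALSE
)

# Pick every column whose name consists of digits only
year_cols <- grep("^[0-9]+$", names(df), value = TRUE)
m <- as.matrix(df[year_cols])

# Left and right neighbours of every cell; the outer edges are padded with 0,
# so the first and last year columns are never filled.
prev_m <- cbind(0, m[, -ncol(m), drop = FALSE])
next_m <- cbind(m[, -1, drop = FALSE], 0)

# Cells that are 0 and have a positive value on both sides
fill <- m == 0 & prev_m > 0 & next_m > 0
m[fill] <- ((prev_m + next_m) / 2)[fill]

df[year_cols] <- m
```

All neighbours are taken from the original values before anything is replaced, so the result does not depend on the order in which columns are processed.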
