Bash script to verify snapshots generated daily

I had been wanting to do this for a long time and finally gave it a try. Here is my attempt at getting an alert when snapshots haven’t been created for several days. My context: I back up files from several users to my NAS, and I also back up my own files to a remote machine. I run the script daily via crontab.

It’s my first somewhat fancy bash script, so suggestions for alternative methods, improvements, or any issues you spot are welcome.

#!/bin/bash

# Get the date of the previous day
YESTERDAY=$(date -d "1 day ago" +"%Y-%m-%d")

# Tokens
TOKEN_RPI2_HENRI="eyJ2ZXJ"
TOKEN_NAS_MANA="eyJ2ZXJzaW"
TOKEN_NAS_CECILE="eyJ2ZXJzaW9u"
TOKEN_NAS_HENRI="eyJ2ZXJzaW9uIjoiMS"

# Connect to repository
connect_to_repository() {
    local token=$1
    kopia repository connect from-config --token "$token"
}

# Check the maximum number of consecutive days without entries
check_max_no_entries() {
  local consecutive_days=$1
  local start_date=$2

  max_count=0
  count=0

  # Iterate through the consecutive_days
  for ((i=0; i<consecutive_days; i++))
  do
    date_to_check=$(date -d "$start_date - $i days" +%Y-%m-%d)
    output=$(cat snapshots.json | jq -r --arg date "$date_to_check" '.[] | select(.startTime[:10] == $date)')

    # If no entries found for a day, increment the count
    if [ -z "$output" ]; then
      ((count++))
    else
      # Update max_count if the current count is greater
      if [ $count -gt $max_count ]; then
        max_count=$count
      fi
      count=0
    fi
  done

  # Check if the last count is greater than max_count
  if [ $count -gt $max_count ]; then
    max_count=$count
  fi

  echo $max_count  
}

# Check if snapshot exists for the given user and host
check_snapshot() {
    local user=$1
    local host=$2
    local output=$(jq -r --arg user "$user" --arg host "$host" --arg date "$YESTERDAY" \
                        --arg day_minus_1 "$(date -d "$YESTERDAY - 1 day" +%Y-%m-%d)" \
                        --arg day_minus_2 "$(date -d "$YESTERDAY - 2 days" +%Y-%m-%d)" \
                        --arg day_minus_3 "$(date -d "$YESTERDAY - 3 days" +%Y-%m-%d)" \
                        --arg day_minus_4 "$(date -d "$YESTERDAY - 4 days" +%Y-%m-%d)" \
                        '.[] | select(.source.host == $host) | select(.source.userName == $user) | select(.startTime[:10] == $date or .startTime[:10] == $day_minus_1 or .startTime[:10] == $day_minus_2 or .startTime[:10] == $day_minus_3 or .startTime[:10] == $day_minus_4) | .startTime' snapshots.json)

    echo $(check_max_no_entries 5 "$YESTERDAY")
}

# Process repository
process_repository() {
    local token=$1
    local user=$2
    local host=$3
    local repo=$4

    connect_to_repository "$token"
    kopia snapshot list --json --all > snapshots.json
    consecutive_days_without_snapshots=$(check_snapshot "$user" "$host")
    if [ $consecutive_days_without_snapshots -gt 0 ]; then
        printf "No snapshots were created on %s as %s@%s for %s consecutive days\n" "$repo" "$user" "$host" "$consecutive_days_without_snapshots" 
    fi
    kopia repository disconnect
}

# Create a summary file
printf "Summary of kopia snapshots not being generated daily\n\n" > summary.txt

# Process repositories and append results to the summary file
process_repository "$TOKEN_RPI2_HENRI" "user1" "host1" "repo 1"  >> summary.txt
process_repository "$TOKEN_RPI2_HENRI" "user2" "host2" "repo 1"  >> summary.txt
process_repository "$TOKEN_NAS_MANA" "user2" "host2" "repo 2"  >> summary.txt
process_repository "$TOKEN_NAS_HENRI" "user3" "host3" "repo 2"  >> summary.txt
process_repository "$TOKEN_NAS_CECILE" "user4" "host4" "repo 3"  >> summary.txt

# Send the summary email
#mail -s "Kopia Backup Summary" myemail@domain.com -a "Content-Type: text/plain; charset=UTF-8" < summary.txt

# Send the summary email only if issues were detected
if [ $(wc -l < summary.txt) -gt 2 ]; then
  mail -s "Kopia Backup Summary" myemail@domain.com -a "Content-Type: text/plain; charset=UTF-8" < summary.txt
else
  echo "No issue detected"
fi

exit

A good example of automation! Some suggestions:

  • Reserve all-uppercase variable names (like YESTERDAY) for environment variables and for the shell variables that POSIX and the operating system define, so that your own names cannot collide with them.
  • The variables max_count and count should also be local to the check_max_no_entries function.
  • It is a good habit to wrap variable names in curly braces, e.g. $token => ${token}, to avoid surprises that are hard to debug.
  • cat snapshots.json | jq ... - do not use cat when it isn’t necessary. The same can be done without the extra process: <snapshots.json jq ....
  • Avoid variable names that match the name of an external utility, such as host.
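
To illustrate the point about local variables, here is a minimal sketch. The function names leaky and scoped are made up for the demonstration and are not part of the script above.

```shell
#!/bin/bash
# Hypothetical demo: 'leaky' and 'scoped' are illustration-only names.

count=100   # a global that happens to share a name with a function variable

leaky() {
  count=0                 # no 'local': this overwrites the global 'count'
  count=$((count + 1))
}

scoped() {
  local count=0           # visible only inside this function
  count=$((count + 1))
}

scoped
echo "after scoped: ${count}"   # the global is untouched: 100

leaky
echo "after leaky: ${count}"    # the global was overwritten: 1
```

In the original script the functions are called through $(...), which runs them in a subshell, so the leak would not reach the caller there. Declaring the variables local still keeps the function safe if it is ever called directly.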

Thank you, your suggestions are much appreciated (and implemented now). Can you elaborate on the type of issues that you prevent by using ${variable_name}?

The simplest one is:

foo="dog"
echo "$foobar"

You might want to use shellcheck to catch more errors
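
Spelled out a little more, as a small sketch (the variable names wrong and right are only there for the demonstration): without braces the shell looks up a variable called foobar, which is unset, so the result is an empty string rather than "dogbar".

```shell
#!/bin/bash
foo="dog"

# Without braces, bash reads the longest valid name: 'foobar' (unset here).
wrong="$foobar"

# With braces, the name ends at '}', and "bar" is appended as literal text.
right="${foo}bar"

printf 'wrong=[%s] right=[%s]\n' "${wrong}" "${right}"
```

With set -u enabled, the first expansion would stop the script with an "unbound variable" error instead of silently producing an empty string.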


If someone comes here intending to do something similar, my script above was wrong. The following works for me:

#!/bin/bash

# Tokens
token_rpi2_henri="eyJ2ZXJ"
token_nas_mana="eyJ2ZXJzaW"
token_nas_cecile="eyJ2ZXJzaW9u"
token_nas_henri="eyJ2ZXJzaW9uIjoiMS"

# Check the maximum number of consecutive days without entries
check_days_no_entries() {
  local consecutive_days=$1
  # To test functionality from the CLI, replace start_date with a date in the future instead of today
  #local start_date="2023-05-22"
  local start_date=$(date +"%Y-%m-%d")
  local count=0

  # Iterate through the consecutive_days
  for ((i=0; i<consecutive_days; i++))
  do
    date_to_check=$(date -d "${start_date} - ${i} days" +%Y-%m-%d)
    output=$(jq -r --arg date "${date_to_check}" '.[] | select(.startTime[:10] == $date)' snapshots.json)

    # If no entries found for a day, increment the count
    if [ -z "${output}" ]; then
      ((count++))
    else
      # there was a backup that day, no need to continue
      break
    fi
  done

  echo "${count}"
}

# Process repository
process_repository() {
    local token=$1
    local kopia_user=$2
    local kopia_host=$3
    local repo=$4

    kopia repository connect from-config --token "${token}"
    kopia snapshot list --json --all > snapshots.json
    consecutive_days_without_snapshots=$(check_days_no_entries 5)
    # test if gt 1 rather than 0 to account for snapshots created at various times in the day, since we're 
    # starting from today (backup today or yesterday is OK)
    if [ "${consecutive_days_without_snapshots}" -gt 1 ]; then
        printf "No snapshots were created on %s as %s@%s for %s consecutive days\n" "${repo}" "${kopia_user}" "${kopia_host}" "${consecutive_days_without_snapshots}" 
    fi
    kopia repository disconnect
}

# Create a summary file with heading
printf "Summary of kopia snapshots not being generated daily\n\n" > summary.txt

# Process repositories and append results to the summary file

process_repository "$token_rpi2_henri" "user1" "host1" "repo 1"  >> summary.txt
process_repository "$token_rpi2_henri" "user2" "host2" "repo 1"  >> summary.txt
process_repository "$token_nas_mana" "user2" "host2" "repo 2"  >> summary.txt
process_repository "$token_nas_henri" "user3" "host3" "repo 2"  >> summary.txt
process_repository "$token_nas_cecile" "user4" "host4" "repo 3"  >> summary.txt

# Send the summary email if issues were detected
if [ "$(wc -l < summary.txt)" -gt 2 ]; then
  mail -s "Kopia Backup Summary" myemail@domain.com -a "Content-Type: text/plain; charset=UTF-8" < summary.txt
  # For tests from cli:
  cat summary.txt
else
  echo "No issue detected"
fi

exit
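
One thing to watch in the script above: kopia snapshot list --all returns snapshots for every user and host in the repository, and the jq query does not filter on kopia_user or kopia_host, so a backup by any user would count for all of them. A possible way to handle that is sketched below. The function names snapshot_dates and count_missed_days are hypothetical, the JSON field names are taken from the first script in this thread, and GNU date is assumed.

```shell
#!/bin/bash
# Sketch only: 'snapshot_dates' and 'count_missed_days' are hypothetical names.

# Print the distinct snapshot dates (YYYY-MM-DD) for one user@host.
# Field names (.source.userName, .source.host, .startTime) come from
# the first script in this thread.
snapshot_dates() {
  local kopia_user=$1 kopia_host=$2 json_file=$3
  jq -r --arg user "${kopia_user}" --arg host "${kopia_host}" \
    '.[] | select(.source.userName == $user and .source.host == $host) | .startTime[:10]' \
    "${json_file}" | sort -u
}

# Read snapshot dates from stdin and count how many days, going back from
# start_date, have no snapshot. Stops at the first day that has one.
count_missed_days() {
  local start_date=$1 max_days=$2
  local dates day i count=0
  dates=$(cat)
  for ((i = 0; i < max_days; i++)); do
    day=$(date -d "${start_date} - ${i} days" +%Y-%m-%d)   # GNU date syntax
    if grep -qxF -- "${day}" <<<"${dates}"; then
      break
    fi
    count=$((count + 1))
  done
  echo "${count}"
}

# Example with fixed dates instead of real kopia output:
printf '%s\n' 2023-05-20 2023-05-18 | count_missed_days 2023-05-22 5
```

In the real script the two would be combined as snapshot_dates "${kopia_user}" "${kopia_host}" snapshots.json | count_missed_days "$(date +%Y-%m-%d)" 5.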

This probably doesn’t matter with only a few users, but with more of them the script would become hard to maintain. A good programming rule is not to mix data and code, so I suggest the modification below. I didn’t check the logic; I only separated the data from the code, so that anyone can use the script without knowing who Cecile or Henri is, by changing only the data part. That part can also be moved to an external file and sourced by the script.

#!/bin/bash

###### data ###########
email2report='myemail@domain.com'

db='
##===== database format ===========================================
# RealUserName: KopiaUserName : host  : Hardware : repoName : Token
#------------------------------------------------------------------

# Henri is special
Henri         : user1         : host1 : rpi      : repo 1   : eyJ2ZXJ
Henri         : user2         : host2 : nas      : repo 2   : eyJ2ZXJzaW9uIjoiMS
Henri         : user3         : host3 : nas      : repo 3   : eyJ2ZXJzaW9uIjoiMS

# Mana is important
Mana          : user2         : host2 : nas      : repo 2   : eyJ2ZXJzaW

# Cecile is a hard worker
Cecile        : user4         : host4 : nas      : repo 3   : eyJ2ZXJzaW9u

'

summary='/dev/shm/kopia_summary.txt'
snapshots='/dev/shm/kopia_snapshots.json'

####### code ##########

on_exit(){
  [ -f "${snapshots}" ] && rm "${snapshots}"
  [ -f "${summary}" ]   && rm "${summary}"
}


# Check the maximum number of consecutive days without entries
check_days_no_entries() {
  local consecutive_days=$1
  # To test functionality from the CLI, replace start_date with a date in the future instead of today
  #local start_date="2023-05-22"
  local start_date=$(date +"%Y-%m-%d")
  local count=0

  # Iterate through the consecutive_days
  for ((i=0; i<consecutive_days; i++))
  do
    date_to_check=$(date -d "${start_date} - ${i} days" +%Y-%m-%d)
    output=$(
      jq -r --arg date "${date_to_check}" '.[] | select(.startTime[:10] == $date)' "${snapshots}"
    )

    # If no entries found for a day, increment the count
    if [ -z "${output}" ]; then
      ((count++))
    else
      # there was a backup that day, no need to continue
      break
    fi
  done

  echo "${count}"
}

# Process repository
process_repository() {
    local token=$1
    local kopia_user=$2
    local kopia_host=$3
    local repo=$4

    kopia repository connect from-config --token "${token}"
    kopia snapshot list --json --all > "${snapshots}"
    consecutive_days_without_snapshots=$(check_days_no_entries 5)
    # test if gt 1 rather than 0 to account for snapshots created at various times in the day, since we're 
    # starting from today (backup today or yesterday is OK)
    if [ ${consecutive_days_without_snapshots} -gt 1 ]; then
        printf "No snapshots were created on %s as %s@%s for %s consecutive days\n" "${repo}" "${kopia_user}" "${kopia_host}" "${consecutive_days_without_snapshots}" 
    fi
    kopia repository disconnect
}



trap on_exit 0


# Create a summary file with heading
printf "Summary of kopia snapshots not being generated daily\n\n" > "${summary}"

# Process repositories and append results to the summary file
IFS='
'
for record in ${db}; do
  r=$(echo "${record}" | sed -r 's/^\s*#.*$//g; s/#.*$//g; s/^\s*$//g')
  [ -z "${r}" ] && continue
  rUser=$(    echo "${r}" | awk -F: '{gsub(/^[[:space:]]*|[[:space:]]*$/,"",$1); print $1}' )
  kUser=$(    echo "${r}" | awk -F: '{gsub(/^[[:space:]]*|[[:space:]]*$/,"",$2); print $2}' )
  uHost=$(    echo "${r}" | awk -F: '{gsub(/^[[:space:]]*|[[:space:]]*$/,"",$3); print $3}' )
  hardWare=$( echo "${r}" | awk -F: '{gsub(/^[[:space:]]*|[[:space:]]*$/,"",$4); print $4}' )
  repo=$(     echo "${r}" | awk -F: '{gsub(/^[[:space:]]*|[[:space:]]*$/,"",$5); print $5}' )
  token=$(    echo "${r}" | awk -F: '{gsub(/^[[:space:]]*|[[:space:]]*$/,"",$6); print $6}' )

  process_repository "${token}" "${kUser}" "${uHost}" "${repo}" >> "${summary}"
done


# Send the summary email if issues were detected
if [ $(wc -l < "${summary}") -gt 2 ]; then
  mail -s "Kopia Backup Summary" "${email2report}" -a "Content-Type: text/plain; charset=UTF-8" < "${summary}"
  # For tests from cli:
  cat "${summary}"
else
  echo "No issue detected"
fi

exit
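
The record loop above starts one sed and six awk processes per line. As a possible alternative, the same colon-separated format can be split with the shell's own read builtin. This is only a sketch: the helper names trim and parse_records are hypothetical, and it assumes that no field contains a # character.

```shell
#!/bin/bash
# Sketch only: 'trim' and 'parse_records' are hypothetical helper names.

# Strip leading and trailing whitespace using parameter expansion only.
trim() {
  local s=$1
  s="${s#"${s%%[![:space:]]*}"}"   # remove leading whitespace
  s="${s%"${s##*[![:space:]]}"}"   # remove trailing whitespace
  printf '%s' "${s}"
}

# Read records from stdin, skip comments and blank lines, and print
# token|kopia_user|kopia_host|repo for each remaining record.
parse_records() {
  local line real_user kopia_user kopia_host hardware repo token
  while IFS= read -r line; do
    line=${line%%#*}                               # drop comments (assumes no '#' in data)
    [[ -z ${line//[[:space:]]/} ]] && continue     # skip blank lines
    # real_user and hardware are read but not used further in this sketch
    IFS=: read -r real_user kopia_user kopia_host hardware repo token <<<"${line}"
    printf '%s|%s|%s|%s\n' \
      "$(trim "${token}")" "$(trim "${kopia_user}")" \
      "$(trim "${kopia_host}")" "$(trim "${repo}")"
  done
}

db='
# Henri is special
Henri         : user1         : host1 : rpi      : repo 1   : eyJ2ZXJ
'

parse_records <<<"${db}"
```

In the script itself, the printf would be replaced by the call to process_repository with the trimmed fields. Because token is the last name given to read, it receives the rest of the line, so a token containing a colon would still arrive intact.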

Also, this:

if [ "$(wc -l < summary.txt)" -gt 2 ]

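compares the output of wc -l, which the shell holds as a string, with an integer. Bash converts the string for -gt, so this works whenever the output is a clean number, but it is worth keeping in mind: if the output is ever empty or non-numeric, the test fails with an “integer expression expected” error.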
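
One way to make such a check more defensive is to validate the value before comparing it. The sketch below is only an illustration; the function name has_more_lines_than is hypothetical.

```shell
#!/bin/bash
# Sketch only: 'has_more_lines_than' is a hypothetical helper name.

# Succeeds if FILE has more than THRESHOLD lines.
# Returns 2 if the line count could not be read as an integer.
has_more_lines_than() {
  local file=$1 threshold=$2 line_count
  line_count=$(wc -l < "${file}")
  line_count=${line_count//[[:space:]]/}        # some wc versions pad the number
  [[ ${line_count} =~ ^[0-9]+$ ]] || return 2   # empty or non-numeric output
  (( line_count > threshold ))                  # arithmetic comparison
}

# Example with a throwaway file holding a heading, a blank line and one issue
tmp_file=$(mktemp)
printf 'heading\n\nissue 1\n' > "${tmp_file}"
if has_more_lines_than "${tmp_file}" 2; then
  echo "issues detected"
else
  echo "No issue detected"
fi
rm -f "${tmp_file}"
```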


Thank you Alex. I considered doing something similar and would have done it if it were Python, but I got lazy trying to figure out how to extract data structure elements from an array in bash :blush: Nice way of doing it, I’ll keep it in my records! And you are right, it’s good practice to stay disciplined when writing code, even for personal scripts.


You can also check out my Python script that does something similar: ekutner/kopia-mon on GitHub.
Main features:

  • Alerts on snapshot errors
  • Alerts on snapshot inactivity/missed backups, with an option to do a basic verification that there were actually changes that required a backup
  • Handles multiple repositories
  • Generates a nice HTML email report
  • Supports SMTP servers that require authentication, like Gmail

Thank you Eran @ekutner! This is great. I took a look on GitHub and will try your script. FYI, I have settled on creating an SMTP account with Sendinblue (now brevo.com) for email alerts from my server, as I prefer to avoid Gmail.

Rather than using virtual environments (which I have never tried), when there are dependencies that could break other stuff I create a new LXC on my Proxmox server. Proxmox is really great! I have an LXC dedicated to kopia and IDrive backups, and I think I’ll try your script on that LXC, as none of the dependencies in requirements.txt would conflict with other stuff.
