The Daily Mail - What are your ideas on backup reporting?

Hi all.

I’m wondering what your ideas are on backup reporting.

E.g. a daily status email like so would be nice (example on an Android notification screen):

MoSCoW method, loosely applied:

M: daily mail stating whether all is OK, or whether there are warnings or failures (also for rclone remote storage)
M: part of the Kopia repository server
M: show a visualisation of all snapshots (kopia snapshot list --all) that the user is granted access to (using ACLs)
M: set a drift threshold and only report outside the drift window. E.g. daily snapshots should raise a warning after 1 day and a critical after 5 days.
M: show an overview of storage space in use (bonus: show the amount of storage before deduplication and compression, just to make you feel good)
S: browse and mount snapshots directly (so this could be integrated into KopiaUI) under ‘reporting’ or ‘status overview’ in the context menu or something
C: (jkowalski) notification policy at all levels (global, per user, per host, per path) that specifies

  • what to send,
  • when to send it (periodically, on success, on failure) and
  • how (email, Slack, Pushover, local notifications, MQTT etc.)

W: centrally managed service by a company (could be useful, but should not be forced)
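The drift idea above could be sketched in a few lines of shell. This is a minimal, untested sketch: it assumes GNU date and that you have already extracted the last snapshot timestamp (e.g. from kopia snapshot list); the LAST value below is a hypothetical placeholder.

```shell
#!/bin/bash
# Minimal drift-check sketch (assumptions: GNU date; LAST extracted
# beforehand from e.g. `kopia snapshot list` output).
LAST="2021-12-08 18:28:14"   # hypothetical last-snapshot timestamp
WARN_DAYS=1                  # warn after 1 day without a snapshot
CRIT_DAYS=5                  # critical after 5 days

now=$(date +%s)
last=$(date -d "$LAST" +%s)
age_days=$(( (now - last) / 86400 ))

if   [ "$age_days" -ge "$CRIT_DAYS" ]; then echo "CRITICAL: ${age_days} day(s) since last snapshot"
elif [ "$age_days" -ge "$WARN_DAYS" ]; then echo "WARNING: ${age_days} day(s) since last snapshot"
else                                        echo "OK: last snapshot ${age_days} day(s) ago"
fi
```

Per-source thresholds would presumably come from the notification policy, so hourly and annual snapshots get different drift windows.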

I have CrashPlan Personal (the good ol’ one), Duplicati Monitoring and dupReport as references (as part of the Kopia repo server).

Before you know it, one has a plenitude of hosts and policies all snapshotting away at the speed of light, locally and remotely.

What sort of things Must, Should, Could and Won’t be in there in your preferences?

PS: I’ll harvest and summarize the input here into a long list of preferences to inspire future work.

This is completely uncharted territory, so thanks for starting this thread. You have a bunch of great ideas here.

I was thinking we could configure all this with policies. There could be a notification policy at all levels (global, per user, per host, per path) that specifies what to send, when to send it (periodically, on success, on failure) and how (email, Slack, Pushover, local notifications, etc.), and the server would be responsible for scheduling and sending these notifications.

There should probably also be a kopia report <path> command that produces or sends a backup report for a given path, for anyone who does not want to use the server.

It is kind of hard to find one template for all use cases.
If you have a busy server with a million files, some of which appear or change frequently, one would probably accept that out of 1M possible files, 999,999 were backed up today, and the one that had an error (e.g. it was being written to while the snapshot ran) will be picked up by the next snapshot.
On the other hand, if you snapshot a directory with 5 super-secret certificates that change yearly at most, you would VERY much want a report if all 5 did not back up, and possibly which ones failed. But a list of failed files is completely useless on a box with 1M files when some error prevents all 1M from being read.

For log files that grow, it might be “good enough” to get the first 99% of the log correct while the last 1% that keeps changing fails after 1-3-5 retries because it is constantly moving. But once you accept this, you no longer need a report that says “had to retry 5 times, still grew while copying, did my best”.
So while we can tell .kopiaignore that we don’t care about certain objects, we also need a way to state “if this works, fine; if not, let it pass”, or the reports will contain lots of superfluous info. In my experience, if you get too much crap in nightly reports, they will not be read after 2 weeks.

There is a large span of expectations from people with different use cases. Is 99 hosts doing 99.99%-OK backups “all green”, or is it a case of “TOTAL FAILURE” because none managed to pass without some odd socket, link, portal file or whatever causing a log line?

Everyone loves “all hosts green”, but for machines seeing serious use this will seldom occur. So the hard question is how to manage the information in the report so it conveys when some machines need attention (e.g. tuning kopiaignore) and when they do not. Getting this right is seldom as easy as one hopes (just like tuning spam filters to drop all spam and never an important email).

(sorry for long rant, been doing restores for a long time)


I’d prefer to have metrics available for everything rather than a mail (as mentioned here: [Feature Request] Prometheus metrics from kopia server · Issue #609 · kopia/kopia · GitHub). You can then scrape the metrics whenever you want and trigger alerts from there. So, priority-wise, I’d prefer some improvements there (I haven’t checked the available metrics yet, I have to admit).
As a feature, the possibility to send a notification via mail/Slack/whatever is nice. However, only the triggers for that should be implemented — the what and when can be scripted and does not have to be part of the server, IMHO.
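The “script it yourself from metrics” idea could look roughly like this. Heavily hedged sketch: both the metrics URL and the metric name below are hypothetical placeholders, not confirmed kopia metrics — check the linked issue and your kopia version for what is actually exposed.

```shell
#!/bin/bash
# Hedged sketch: trigger your own alert from scraped metrics instead of
# building notifications into the server. METRICS_URL and the metric
# name are hypothetical placeholders.
METRICS_URL="http://localhost:9091/metrics"   # placeholder endpoint

# Pull the first value of a (hypothetical) error counter, if present.
errors=$(curl -s "$METRICS_URL" | awk '/^kopia_snapshot_errors_total/ {print $2; exit}')

# Only mail when the counter exists and is above zero.
if [ -n "$errors" ] && [ "${errors%.*}" -gt 0 ] 2>/dev/null; then
  echo "kopia reported $errors snapshot error(s)" \
    | mail --subject="kopia alert" my.destination@email.address
fi
```

In a real setup you would more likely let Prometheus scrape the endpoint and drive this from Alertmanager; the script just shows that the trigger logic need not live in the server.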


Are there any other ideas on this?

Let’s help the developers with ideas they can pick and choose from once there is time to work on backup reporting.

OK, here is what I do now. Crappy as it is.

  1. A daily cron job lists all snapshots on the repository server and emails the output to me. This way I can spot when a backup job is no longer running (the dates stop increasing) at a glance every day or so.
  2. A daily cron job rclones the whole repository to both MSFT and Backblaze. I reused a simple script from someone else to encapsulate the rclone action.
root@repositoryserver:~# cat /etc/cron.daily/kopia
#!/bin/bash
# crappy backup report temporarily
kopia snapshot list --max-results=1 --all | grep -v -B1 `date "+%Y-%m-%d"` |  mail --subject="kopiarepoadminuser@kopia backup report" -a "From: root at repositoryserver <my.source@email.address>" my.destination@email.address

Sample output:

user@host1:C:\Users
  2021-12-08 18:28:14 CET k29590b6b403353d2820205c43d95ea1c3cee1ee8c6693f5b2 76.3 GB dr-xr-xr-x files:39944 dirs:7768 errors:1 (latest-1,hourly-1,daily-1,weekly-1,monthly-1,annual-1)

user@host2:/boot/grub
  2021-12-08 21:00:00 CET k2f5fabe81720f5349ee3a6b4feb5740e7d58c0a6b3c308f21 6.8 MB drwxr-xr-x files:293 dirs:4 (latest-1,hourly-1,daily-1,weekly-1,monthly-1,annual-1)
[...]

For Backblaze B2 (note the additional options to save cost and increase performance):

#!/bin/bash

##############################################################################
# An rclone backup script by Jared Males (jaredmales@gmail.com)
# 
# Copyright (C) 2018 Jared Males <jaredmales@gmail.com>
#
# This script is licensed under the terms of the MIT license.
# https://opensource.org/licenses/MIT
#
# Runs the 'rclone sync' command.  Designed to be used as a cron job.
#
# 1) Backup Source
#    Edit the SRC variable below to point to the directory you want to backup.
#
# 2) Backup Destination
#    Edit the DEST variable to point to the remote and location (see rclone docs).
# 
# 3) Excluding files and directories
#    Edit the EXCLUDEFILE variable below to point to a file listing files and directories to exclude.
#    See the rclone docs for the format.
#
#    Also, any directory can be excluded by adding an '.rclone-ignore' file to it without editing the exclude file.
#    This file can be empty.  You can edit the name of this file with EXIFPRESENT below.
# 
# 4) You can change the bandwidth limits by editing BWLIMIT, which includes a timetable facility.  
#    See rclone docs for more info.
#
# 5) Logs:
#    -- The output of rclone is written to the location specified by LOGFILE.  This is rotated with savelog. 
#       The details of synclog can be edited. 
#    -- The log rotation, and start and stop times of this script, are written to the location specified by CRONLOG.
#       This isn't yet rotated, probably should be based on size.
#   
##############################################################################

#### rclone sync options

RCLONEDATA=/opt/bcp/rclone

SRC=/opt/bcp/kopia

#---- Edit this to the desired destination
DEST=bucketname:/folderinbucket

#---- This is the path to a file with a list of exclude rules
EXCLUDEFILE=$RCLONEDATA/excludes 

#---- Name of exclude file
# NOTE: you need "v1.39-036-g2030dc13β" or later for this to work.
EXIFPRESENT=rclone-ignore

#---- The bandwidth time table https://rclone.org/docs/#bwlimit-bandwidth-spec
BWLIMIT="07:00,5M 23:30,off Sat-00:01,5M"

#---- Don't sync brand new stuff, possible partials, etc.
MINAGE=15m

#---- [B2 Specific] number of transfers to do in parallel.  rclone docs say 32 is recommended for B2.
TRANSFERS=32

#---- Additional options
# Note: bucket-based storage (S3, B2, GCS, Swift, Hubic) uses fewer transactions (pay less) at the cost of more memory https://rclone.org/docs/#fast-list
OPTIONS="--fast-list"

#---- Location of sync log [will be rotated with savelog]
LOGFILE=$RCLONEDATA/rclone-sync.log
LOGS='-vv --log-file='$LOGFILE

#---- Location of cron log
CRONLOG=$RCLONEDATA/rclone-cron.log


###################################################
## Locking Boilerplate from https://gist.github.com/przemoc/571091
## Included under MIT License:
###################################################

## Copyright (C) 2009 Przemyslaw Pawelczyk <przemoc@gmail.com>
##
## This script is licensed under the terms of the MIT license.
## https://opensource.org/licenses/MIT
#
# Lockable script boilerplate

### HEADER ###

LOCKFILE="/tmp/`basename $0`"
LOCKFD=99

# PRIVATE
_lock()             { flock -$1 $LOCKFD; }
_no_more_locking()  { _lock u; _lock xn && rm -f $LOCKFILE; }
_prepare_locking()  { eval "exec $LOCKFD>\"$LOCKFILE\""; trap _no_more_locking EXIT; }

# ON START
_prepare_locking

# PUBLIC
exlock_now()        { _lock xn; }  # obtain an exclusive lock immediately or fail
exlock()            { _lock x; }   # obtain an exclusive lock
shlock()            { _lock s; }   # obtain a shared lock
unlock()            { _lock u; }   # drop a lock

###################################################
# End of locking code from Pawelczyk
###################################################


#make a log entry if we exit because locked
exit_on_lock()      { echo $(date)" | `basename $0` already running." >> $CRONLOG; exit 1; }


#Now check for lock
exlock_now || exit_on_lock
#We now have the lock.

#Rotate logs.
savelog -n -c 7 $LOGFILE >> $CRONLOG

#Log startup
echo $(date)" | `basename $0` starting . . . " >> $CRONLOG

#Now do the sync!
rclone sync $SRC $DEST $OPTIONS --transfers $TRANSFERS --bwlimit "$BWLIMIT" --min-age $MINAGE --exclude-from $EXCLUDEFILE --exclude-if-present $EXIFPRESENT --delete-excluded $LOGS

#log success
echo $(date)" | `basename $0` completed." >> $CRONLOG

# Mail today's loglines 
TODAY="$(date +'%a %d %b %G')"
grep "$TODAY" $CRONLOG | grep -v Rotated | mail --subject="reposerver $TODAY rclone sync $SRC $DEST" -a "From: root at reposerver <my.source@email.address>" my.destination@email.address

#release the lock
unlock

exit

Similar for OneDrive:

The script is identical to the B2 one above; only the destination differs:

DEST=email_outlook_com:/kopia

Sample output:
repositoryserver Thu 09 Dec 2021 rclone sync /opt/bcp/kopia bucketname:/bucketfolder

Sample output:

kopiarepoadmin@kopia backup report
Sun 21 Nov 2021 11:28:08 PM CET | rclone-user-outlook-com starting . . . 
Sun 21 Nov 2021 11:44:36 PM CET | rclone-user-outlook-com completed.
Sun 21 Nov 2021 11:44:36 PM CET | rclone-b2 starting . . . 
Sun 21 Nov 2021 11:46:08 PM CET | rclone-b2 completed.

Everyone, more ideas on use cases for backup reporting? I’d love to hear!

To send emails:

sendemail
Mutt
Mailsend-go

But all of these support password authentication only, not modern OAuth2.

kopia snapshot list | tee -a logs/snapshots.log
sendemail .... -a logs/snapshots.log

Alternatively, I have found https://gotify.net/ (not yet tested).

Both can give you a notification on your mobile screen; Gotify just takes a different approach.

Regards,
Sd

Kia ora folks,

Is it possible today to extract, from any log, a list of files which were changed in a snapshot? I.e. the server task details note that x files were hashed. In any case, this is something I’d like to be able to see in a notification/report.

Ngā mihi
Rhys
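One way to approximate this today might be to diff the two most recent snapshots of a source. Untested sketch: it assumes kopia diff accepts two snapshot object IDs, that the object ID (the k… value) is the fourth column of kopia snapshot list output as in the samples earlier in this thread, and the SOURCE value is a hypothetical placeholder.

```shell
#!/bin/bash
# Untested sketch: approximate "files changed in the latest snapshot" by
# diffing the two most recent snapshot object IDs for one source.
# Assumes the object ID (k...) is column 4 of `kopia snapshot list`
# output, as in the sample output earlier in this thread.
SOURCE="user@host:/path"   # hypothetical source spec

# Collect the object IDs of the two most recent snapshots.
ids=($(kopia snapshot list "$SOURCE" | awk '$4 ~ /^k/ {print $4}' | tail -n 2))

if [ "${#ids[@]}" -eq 2 ]; then
  kopia diff "${ids[0]}" "${ids[1]}"
else
  echo "need at least two snapshots of $SOURCE to diff" >&2
fi
```

Parsing column positions is fragile across kopia versions, so a built-in “changed files” section in the report would still be the better answer.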