You are reading content from Scuttlebutt
@kas %xVgKaRsvR+X0mKqeCFlEDiEb1wDfQaLKgIRdmYrIyNc=.sha256

Remove blobs by size and modification time

Inspired by an idea by @joeyh, I have written a thin wrapper around find(1) from GNU findutils to remove blobs based on their sizes and modification times, suitable to be run as a cron.daily job.

What the script does is basically several iterations of

  find "$BLOBS_DIR" -type f -size "+$SIZE" -mtime "+$DAYS" -print0 \
  | xargs -r0 rm -f  # substitute “rm -f” with “ls -l”
                     # to see the blobs without removing them

The unmodified script will remove these blobs:

  size > 5120k and age >   1 day,
  size > 4096k and age >   7 days,
  size > 2048k and age >  28 days,
  size > 1024k and age >  91 days,
  size >  512k and age > 365 days,
  and empty blobs of any age.
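
The tiers above could be driven by a small loop. What follows is only a sketch of that idea, not the attached script: the function name `list_prunable` and the inline tier table are made up for illustration, and it lists candidates instead of deleting them. Keep in mind that find(1) counts age in whole 24-hour periods, so `-mtime +1` matches files last modified at least two full days ago.

```shell
# Sketch only: print the blobs that the tiers above would select.
# list_prunable and the tier table are illustrative, not taken from the script.
list_prunable() {
    blobs_dir=$1
    {
        # Each line of the table: size threshold in KiB, age threshold in days
        while read -r size days; do
            find "$blobs_dir" -type f -size "+${size}k" -mtime "+$days" -print
        done <<EOF
5120 1
4096 7
2048 28
1024 91
512 365
EOF
        # Empty blobs of any age
        find "$blobs_dir" -type f -empty -print
    } | sort -u    # a large, old blob matches several tiers; list it once
}
```

Running `list_prunable ~/.ssb/blobs/sha256` and reviewing the output first is a cheap way to check the thresholds before wiring anything to rm.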

Use at your own discretion.

:paperclip: ssb-prune-blobs.sh


See also: %48ulqS…

@kas %HCZjLGya8terjImbe85HYUg0vaP//l209hOZICwp63M=.sha256

PS: Initially there may be, say, a handful of empty blobs that exist e.g. because sbot crashed while writing. But if your setup is like mine, there will be one empty blob that returns day after day, and as far as I can tell it is the only ‘legitimate’ empty blob there is:

  ~/.ssb/blobs/sha256/e3/b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

corresponding to

  &47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=.sha256

I wonder which routine is responsible for creating this blob.

Anyone?
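
The correspondence between the path and the blob ID can be checked by hand: the two path components together form the hex SHA-256 digest of the blob's content, and the ID carries the same digest in base64. A rough sketch, assuming the `sha256/<2 hex digits>/<remaining 62 digits>` layout seen above; `blob_path_to_id` is a made-up helper name, not something sbot provides:

```shell
# Sketch: turn a blob path under .../sha256/ into an SSB blob ID.
# blob_path_to_id is a hypothetical helper.
blob_path_to_id() {
    # Join the last two path components into one hex digest
    hex=$(printf '%s\n' "$1" | sed 's|.*/\([0-9a-f]\{2\}\)/\([0-9a-f]*\)$|\1\2|')
    # Reject anything that is not exactly 64 lowercase hex digits
    case $hex in *[!0-9a-f]*|'') return 1 ;; esac
    [ "${#hex}" -eq 64 ] || return 1
    b64=$(
        while [ -n "$hex" ]; do
            byte=${hex%"${hex#??}"}                  # first two hex digits
            hex=${hex#??}                            # the rest
            printf "\\$(printf '%03o' "0x$byte")"    # emit one raw byte
        done | base64
    )
    printf '&%s.sha256\n' "$b64"
}

blob_path_to_id ~/.ssb/blobs/sha256/e3/b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
```

The hex digest itself is what `printf '' | sha256sum` prints, which confirms that this blob is the hash of zero bytes of content.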


empty.py

@Anders %m8BqrFmebZ81vBIB6NB7TTM55QNx7r4DEUqx1uTq5rQ=.sha256

Thank you!

@kas %L37RUHL+caIwAm4lyR9UTiHqUITcc/7FYQ61fQss8w0=.sha256

An afterthought:

If you have a setup where blobs are only ever accessed by sbot – e.g., a pub running headlessly on a VPS – it could make sense to use -atime (access time) instead of -mtime (modification time). In that way popular blobs will ‘always’ be immediately accessible, whereas unpopular blobs will be removed when nobody has requested them for a number of days. I haven't really thought it through, but I believe a single find(1) invocation could be sufficient:

  # Blobs are kept until nobody has requested them for a year
  KEEPDAYS=365
  BLOBS=~/.ssb/blobs/sha256

  find "$BLOBS" -type f -atime "+$KEEPDAYS" -print0 \
  | xargs -r0 rm -${verbose}f
     :

:paperclip: ssb-prune-blobs-by-atime.sh

(You could even combine this with -size (e.g., -size +64k) such that only blobs over a certain size are eligible for pruning.)
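
A sketch of that combination follows. It only lists candidates, and `list_stale_blobs` with its parameters is an illustrative name, not part of the attached script. One caveat worth checking first: access times are only useful if the file system records them, so a `noatime` mount would defeat the approach, while `relatime` updates them only coarsely (which should still be adequate for thresholds measured in days).

```shell
# Sketch: list blobs above a size threshold that nobody has read
# for more than a given number of days. Names and defaults are illustrative.
list_stale_blobs() {
    blobs=$1
    keepdays=${2:-365}    # days since last access
    minsize=${3:-64k}     # smaller blobs are never eligible
    find "$blobs" -type f -size "+$minsize" -atime "+$keepdays" -print
}
```

For example, `list_stale_blobs ~/.ssb/blobs/sha256 365 64k` prints the blobs that the one-liner above would remove if it were extended with `-size +64k`.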

@cel %yC2oOYVBTsd4WodQFG1DS3dqY0zU4eE+fMIfxvpkEmU=.sha256

@kas nice.

The empty blob has been used as an avatar: %ANWLh7R...

ssb-blobs writes incoming blobs into ~/.ssb/blobs/tmp and then moves them into ~/.ssb/blobs/sha256 after fully receiving and hashing them, so if there are multiple empty blobs in ~/.ssb/blobs/sha256 then there is a bug somewhere.
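
To illustrate why a half-written blob should never show up under sha256/, here is the receive-then-rename flow as a shell sketch. It is only an analogy: ssb-blobs itself is JavaScript, and `store_blob` is a made-up stand-in for the idea rather than its actual code.

```shell
# Sketch of "receive into tmp, hash, then move into place".
# store_blob is hypothetical; it reads the blob content from stdin.
store_blob() {
    root=$1                                   # e.g. ~/.ssb/blobs
    mkdir -p "$root/tmp" || return 1
    tmpfile=$(mktemp "$root/tmp/XXXXXX") || return 1
    cat > "$tmpfile"                          # receive the whole blob first
    hex=$(sha256sum < "$tmpfile" | cut -d' ' -f1)
    prefix=$(printf '%s' "$hex" | cut -c1-2)  # directory: first two hex digits
    rest=$(printf '%s' "$hex" | cut -c3-)     # file name: remaining digits
    mkdir -p "$root/sha256/$prefix" || return 1
    # A rename within one file system is atomic, so readers never see a partial blob
    mv "$tmpfile" "$root/sha256/$prefix/$rest"
    printf '%s\n' "$root/sha256/$prefix/$rest"
}
```

Because the final name is derived from the content hash, storing the same content twice lands on the same path, so at most one empty blob can exist if everything goes through this route.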

Another blob pruning technique: %n8hBtQU...

@kas %qC+OEswNaqxM6xEQMYMlmlXaG4r3xoIeWkzvcCxxSZI=.sha256

@cel,

> The empty blob has been used as an avatar

Ah, that explains why it keeps coming back. Thanks for the investigation!

> Another blob pruning technique

Thanks! I somehow missed that message when you posted it, or pruning of blobs wasn't necessary at that time. I like the idea of making sure that a blob isn't lost.

     :
  awk '{print ENVIRON["HOME"] "/" $0}' | stest -e |\
     :

I assume that stest -e works like test -e, but as a filter in a pipe (and would be easy to implement). Did you write it yourself, or is it from some Unix collection? I cannot seem to find it on Arch Linux.
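
For anyone without stest at hand, the behaviour guessed at here is easy to approximate. A minimal stand-in, assuming one path per line on stdin; `exists_filter` is an invented name, and it covers only the `-e` case discussed in this message:

```shell
# Sketch: pass through only those paths from stdin that exist.
# exists_filter is a hypothetical substitute for "stest -e".
exists_filter() {
    while IFS= read -r path; do
        [ -e "$path" ] && printf '%s\n' "$path"
    done
    return 0    # a missing last path should not make the filter fail
}
```

It would slot into the pipeline in place of `stest -e`, e.g. `awk '{print ENVIRON["HOME"] "/" $0}' | exists_filter`.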

@cel %T8TDSm1fbym45dIv18e04PFxFt2sMguFxJX5j1nwEeA=.sha256

@kas.v2 yes. stest is from the same source repo as dmenu (suckless-tools package on Debian; not sure about Arch)

@kas %4jO9CmaNZ1x6lK9mPCxB/7c2LaNXEw+uj7SU22J4Hos=.sha256

Thanks, @cel. Apparently there are many dmenu derivatives in AUR. dmenu-git is one of them and it does indeed contain stest.

User has chosen not to be hosted publicly
@kas %+esiJFIiZyGSGteyDje14dT0vBRS8jE+hl1D1IEIoXI=.sha256

Thanks, @mycognosist, I'm glad you like it.

A while ago I modified the script slightly:

  1. If run between midnight and 6 o'clock in the morning it will never say a thing (unless there's an error).
  2. If run between 6 o'clock and midnight it will show the names of the blobs that are being removed.
  3. It now looks for $ssb_name (but not $ssb_path) to see if it should use a root directory other than ~/.ssb.
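
The three changes might be sketched as follows. This is a guess at the logic, not an excerpt from the attached script: the helper names are invented, and mapping `$ssb_name` to `~/.$ssb_name` is an assumption.

```shell
# Sketch: choose rm's verbosity from the hour of day (00-23).
# verbose_flag is a hypothetical helper; with no argument it uses the current hour.
verbose_flag() {
    hour=${1:-$(date +%H)}
    if [ "$hour" -lt 6 ]; then
        printf ''     # night-time cron run: print nothing unless rm fails
    else
        printf 'v'    # daytime run: "rm -vf" names each removed blob
    fi
}

# Sketch: pick the blobs directory. The ~/.$ssb_name mapping is an assumption.
blobs_dir() {
    printf '%s\n' "$HOME/.${ssb_name:-ssb}/blobs/sha256"
}

verbose=$(verbose_flag)
BLOBS=$(blobs_dir)
```

With these in place, a removal step of the form `xargs -r0 rm -${verbose}f` is silent at night and verbose during the day.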

:paperclip: ssb-prune-blobs.sh

User has not chosen to be hosted publicly
@kas %Beiq1WMUb+F6F6QtjBJ3IV+hWTRO1W0iZci+B9nAYV4=.sha256

> I'm having a weird problem with this script

It's not weird; you are using the wrong interpreter:

$ head -1 ssb-prune-blobs.sh
#!/usr/bin/env bash

The script is meant to be run by bash (although ksh may work, too).
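
Running the file as `sh ssb-prune-blobs.sh` bypasses the shebang line entirely, which is one way to end up with the wrong interpreter. A script can defend against that with a small guard; the sketch below is not in the attached script, and `require_bash` is an invented helper name:

```shell
# Sketch of an interpreter guard; require_bash is hypothetical.
require_bash() {
    # $1 is the caller's $BASH_VERSION, a variable that only bash sets
    if [ -z "$1" ]; then
        echo "error: this script needs bash; run it directly or as 'bash script'" >&2
        return 1
    fi
}

# At the top of a script:
#   require_bash "${BASH_VERSION:-}" || exit 1
```

The guard itself is written in plain POSIX sh so that it still runs, and can complain, when the script is started by a shell other than bash.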
