This list is currently in no particular order. It's mostly to help me remember how I did things. If you find it useful, please let me know, and I may put more of my secrets here as I remember them.
Problem: Duplicate files take up lots of space.
Solution: This script (technically a one-liner) should work on any relatively modern flavor of Linux or Unix.
#!/bin/bash
# fast find and remove duplicate files
# uniq has no support for a null field separator, so we must use a separator
# character that could, in principle, appear in a file name; hex ff should be
# safe, but that's not guaranteed
# a lot of time is saved by running find only once, and because file sizes
# serve as the first elimination step, fewer md5sum calculations will be
# needed, saving even more time
# 1. find all the file sizes and names
# 2. pad the sizes with leading zeros and use "$IFS" as a field separator (so uniq will work)
# escape non-normal characters in file name and path:
# a. create associative array with each 8-bit character as an array index
# i. use character as value for printable characters
# ii. use C style quoting ($'\xHH') as value for non-printable and special/meta characters
# b. with null field separator each character in name becomes separate array element
# c. loop through each name array element and print value of hex element
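# for example, a 70-byte file named ./foo bar would be emitted as
# 000000000070<ff>./foo$'\x20'bar, where <ff> stands in for the raw hex ff
# separator byte (illustration only)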
# 3. reverse sort so largest files will be checksummed first and uniq will work
# 4. eliminate files with unique sizes
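# (uniq -D -w 12 compares only the first 12 characters, the zero-padded
# size, and keeps every line whose size matches another line's)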
# 5. calculate the md5sum for each potentially duplicate file, reordering the
# fields so the file size comes first, then the md5sum (assuming that files
# with the same md5sum will be the same size)
# add tee to stderr between md5sum and cut to monitor progress; may not work if su'ed
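# e.g. change the command substitution below to:
# $(eval 'md5sum '$FNAME | tee /dev/stderr | cut -b-32)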
# 6. reverse sort so largest files will be at the beginning of the final script and uniq will work
# 7. keep files with duplicate sizes and md5sums only
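# (45 characters = 12-digit size + 1 separator + 32-character md5sum)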
# 8. format for a shell script to rm unwanted files (keep grouping for convenience)
# with the file size and md5sum before the rm commands for reference
# 9. write to rm-dupes.sh, using the current date/time in the file name
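# for illustration, one group in the generated script would look like this
# (made-up size, checksum, and names):
#   # 1,234,567 # 0123456789abcdef0123456789abcdef
#   rm -fv ./some$'\x20'file
#   rm -fv ./backup/copy$'\x20'of$'\x20'it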
IFS=$'\xff' \
&& find . \( -not \( -path ./.\* -prune \
-o -path ./Mail -prune \
\) \) -type f -not -empty -printf "%s${IFS}%p\0" \
| awk -F "$IFS" -v Q="'" 'BEGIN {RS="\0"
for (i=0; i<=255; i++) {c = sprintf("%c", i)
xa[c] = (c ~ /[0-9A-Za-z._~/-]/) ? c : sprintf("$%s\\x%02x%s", Q, i, Q)} }
{fn = sprintf("%012d", $1) FS
fnl = split($2, fna, "")
for (i=1; i<=fnl; i++) fn = fn xa[fna[i]]
print fn}' \
| sort -r \
| uniq -w 12 -D \
| while read FBYTES FNAME
do echo "$FBYTES$IFS"$(eval 'md5sum '$FNAME|cut -b-32)"$IFS$FNAME"
done \
| sort -r \
| uniq -w 45 --all-repeated=separate \
| awk -F "$IFS" -v Q="'" 'BEGIN {b = 1
print "#!/bin/bash" }
/^$/ {b = 1}
!/^$/ {if (b) printf "\n# %" Q "d # %s\n", $1, $2
b = 0
print "rm -fv " $3}' \
> rm-dupes-$(date +%s).sh
# use editor to delete the rm commands for the files we want to keep
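For example, assuming the pipeline above was saved as find-dupes.sh (a file name used here only for illustration; the timestamp in the generated name will differ on your run):
bash find-dupes.sh
vi rm-dupes-1400000000.sh   # delete the rm lines for the copies you want to keep
bash rm-dupes-1400000000.sh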
That's it!
Problem: Google Earth 7 doesn't run in Slackware64 because it requires LSB and has a bug which causes a crash when trying to access Persian fonts.
Solution: This works in Slackware64-14:
rpm2txz google-earth-stable_current_x86_64.rpm    # convert the RPM to a Slackware package
installpkg google-earth-stable_current_x86_64.txz
ln -sf /lib/ld-linux.so.2 /lib/ld-lsb.so.3        # provide the loader names LSB binaries expect
ln -sf /lib64/ld-linux-x86-64.so.2 /lib64/ld-lsb-x86-64.so.3
rm /etc/fonts/conf.d/65-fonts-persian.conf        # remove the font config that triggers the crash
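To double-check the LSB part of the fix, you can inspect the program interpreter the binary requests (the path below is Google Earth's default install location, an assumption on my part; adjust it if yours differs):
readelf -l /opt/google/earth/free/googleearth-bin | grep interpreter
ls -l /lib64/ld-lsb-x86-64.so.3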