Command line tools

August 10, 2021

Introduction

I am a big fan of command line utilities and scripts, so I wanted to share some tricks I have learned and some scripts I use. Some entries are a single command with notable options, more a memo of how to use it; others are homemade scripts.

Basics

find lets you select files with a query (by name, type, age, and so on) and then run a command on each match, either directly with -exec or by piping the list to another command.

# Delete every regular file (not directories) modified more than 60 days ago.
# Comments sit above the command: a comment after a trailing "\" breaks the line continuation.
find . -type f -mtime +60 -exec rm -f {} \;
# Note: using xargs is faster, so we can do something similar.
# Remove everything named '.svn'; -print0 and -0 pass the list NUL-delimited,
# so names containing spaces or newlines are handled correctly.
find . -name .svn -print0 | xargs -0 rm -rf
# Find files by extended regex (BSD/macOS syntax)
find -E . -regex ".*\.(php|sh)"
# GNU find equivalent
find . -regextype posix-extended -regex ".*\.(php|sh)"
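
As a hedged sketch of how the age filter can be checked before anything is deleted: the snippet below builds a throwaway directory (the sandbox variable and both file names are made up for this example), backdates one file with touch -t, and uses -print instead of -exec rm so the match list can be reviewed first.

```shell
#!/bin/sh
# Hypothetical sandbox: a throwaway directory with one old and one recent file
sandbox="$(mktemp -d)"
touch -t 202001010000 "$sandbox/old.log"   # modification time set to 2020-01-01 00:00
touch "$sandbox/new.log"                   # modification time is now

# Dry run: -print only lists the files that a later -exec rm would delete
old_files="$(find "$sandbox" -type f -mtime +60 -print)"
echo "$old_files"
```

Once the list looks right, swapping -print for -exec rm -f {} \; performs the actual deletion.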


xargs allows you to build/execute a command from an input that you get on the pipe.

# -I % defines "%" as the placeholder
cat foo.txt | xargs -I % sh -c "echo %; mkdir %;"
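
To try the placeholder without touching real data, here is a minimal sketch that works in a temporary directory. The workdir variable, the names.txt file and the directory names are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical input: one directory name per line
workdir="$(mktemp -d)"
printf '%s\n' "alpha" "beta" "my photos" > "$workdir/names.txt"

# With -I, xargs runs the command once per input line and substitutes the
# whole line (inner spaces included) wherever "%" appears in the arguments
( cd "$workdir" && xargs -I % mkdir "%" < names.txt )

ls "$workdir"
```

Reading the file with a redirection instead of cat saves one process; the result is the same.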


Find in combination with other utilities

# Copy all mp4 files to /tmp
find . -type f -iname "*.mp4" -exec cp "{}" /tmp \;
# Convert the line endings of all C# files from Windows to Unix (dos2unix)
find . -type f -iname "*.cs" -exec dos2unix "{}" \;
# Rename all files, replacing 'Screenshot-' with 'Screen_'
find . -exec rename -s "Screenshot-" "Screen_" "{}" \;
# For each jpg, print the EXIF creation date, or add a macOS tag if none is found (-I % defines the placeholder)
find . -iname "*.jpg" -print0 | xargs -I % -0 bash -c 'exiftool "%" | grep -i "create date" || tag -a missingExif "%"'
# You can have complex queries for find with \( \) and -o
find . \( -iname "*.html" -o -iname "*.php" \) -print0 | xargs -I % -0 bash -c 'echo "%";'
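
The grouping with \( \) and -o can be tried safely on a scratch directory. This is only a sketch: the site variable and the three file names are invented for the example, and basename stands in for whatever command you would really run.

```shell
#!/bin/sh
# Hypothetical sandbox with three files; only two match the query
site="$(mktemp -d)"
touch "$site/index.html" "$site/contact form.PHP" "$site/notes.txt"

# \( ... -o ... \) groups the two name tests; -print0 and -0 keep the
# name containing a space in one piece
matched="$(find "$site" -type f \( -iname "*.html" -o -iname "*.php" \) -print0 \
    | xargs -0 -n 1 basename \
    | sort)"
echo "$matched"
```

Without the parentheses, -type f would only apply to the first name test, because the implicit "and" binds tighter than -o.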

CURL

curl makes web requests. It handles basic and not-so-basic cases alike: downloading files, forging requests, and even interacting with APIs.

# POST form data; -d already implies POST and this Content-Type, both are spelled out here as a memo
curl \
    -d "param1=value1&param2=value2" \
    -H "Content-Type: application/x-www-form-urlencoded" \
    -X POST \
    URL
# POST the content of the file "data.txt"
curl \
    -d "@data.txt" \
    -X POST \
    URL
# POST JSON data, with the matching Content-Type header
curl \
    -d '{"key1":"value1", "key2":"value2"}' \
    -H "Content-Type: application/json" \
    -X POST URL

FFMPEG

ffmpeg is one of my favorite command line tools, I'm always amazed by what it's possible to do with it!
I use it extensively for some generative art pieces.

# Scale to a fixed size
ffmpeg -i input.mp4 -vf scale=1024:768 output.mp4
# Scale and keep the aspect ratio (-1 derives the height from the width)
ffmpeg -i input.mp4 -vf scale=320:-1 output.mp4
# Crop: "crop=out_width:out_height:x:y"
ffmpeg -i input.mp4 -filter:v "crop=480:360:0:60" output.mp4
# Change speed: in this example accelerate by a factor of ten (0.1*), dropping the audio (-an)
ffmpeg -i input.mp4 -filter:v "setpts=0.1*PTS" -an output.mp4
# Convert a GIF to a video
ffmpeg -f gif -i image.gif -vcodec mpeg4 -y output.mp4
# Get video duration
ffprobe -i input.mp4 -show_entries format=duration -v quiet -of csv="p=0"
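
As a small worked example of the arithmetic behind the speed change: setpts multiplies each frame timestamp, so the multiplier is the inverse of the speed-up. The helper name setpts_factor is made up for this sketch, and the ffmpeg command is only printed, not run.

```shell
#!/bin/sh
# setpts_factor SPEED: print the timestamp multiplier for a given speed-up
# (hypothetical helper, not part of ffmpeg)
setpts_factor() {
    # LC_ALL=C forces a dot as decimal separator, which ffmpeg expects
    LC_ALL=C awk -v speed="$1" 'BEGIN { printf "%g\n", 1 / speed }'
}

factor="$(setpts_factor 10)"
# Print the command instead of running it, so it can be reviewed first
echo "ffmpeg -i input.mp4 -filter:v \"setpts=${factor}*PTS\" -an output.mp4"
```

A speed of 0.5 gives a multiplier of 2, which plays the video at half speed.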


Encode from image list

# -framerate: input framerate; -i: image names (printf-style pattern); -r: output framerate
ffmpeg \
    -framerate 50 \
    -i video_%06d.png \
    -r 50 \
    -pix_fmt yuv420p output.mp4
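
The %06d in the image name is a printf-style pattern: a decimal number, zero-padded to six digits. A quick way to see which file names match is the shell's own printf; frame_name is a hypothetical helper written for this example.

```shell
#!/bin/sh
# frame_name N: print the file name the pattern video_%06d.png gives for frame N
# (hypothetical helper, only here to illustrate the pattern)
frame_name() {
    printf 'video_%06d.png\n' "$1"
}

frame_name 1      # video_000001.png
frame_name 42     # video_000042.png
frame_name 1500   # video_001500.png
```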

Rsync

rsync is powerful and versatile. I use it for backups and remote server sync, but it can do much more!

# Here I use it for basic one-way no-history archiving
# --archive: recursive; preserves times, owner, group, perms; copies symlinks as symlinks
# --delete: deletes files on the destination that are no longer present on the source
# --exclude: skips node_modules (Node ;-)
# Last two arguments: source, then destination
rsync \
    --archive \
    --human-readable --progress \
    --delete \
    --exclude='node_modules' \
    /Users/jerome/projects /Volumes/Backup-disk
# Push the local directory to the distant one
# -e: remote shell to use (here ssh with option -p 22)
rsync --archive --verbose --delete \
    -e "ssh -p 22" \
    static/ [email protected]:/var/www/

macOS specific

I'm a macOS user, and I like this operating system because it is based on UNIX and thus has great command line support.

# Will make your mac speak
say "a sentence" 
# Copy/paste from the terminal
echo "copy that" | pbcopy 
pbpaste
# Similar to unix locate, find a file on your machine
mdfind 
# Mount a disk image
hdiutil attach diskimage.dmg 
# Change the time between two time machine saves
sudo defaults write /System/Library/LaunchDaemons/com.apple.backupd-auto StartInterval -int 1800 # seconds

General shell

# Redirect the standard output to the error output (stdout to stderr); use 2>&1 for the opposite
echo 'test' 1>&2
# Mount a distant directory to the local filesystem
sshfs [email protected]:/directory directory
# Delete the line LINE_NUMBER (prints the result; the file itself is not modified)
sed 'LINE_NUMBERd' file
# Stop/continue a process by pid
kill -s STOP/CONT PID
# Create an archive for this directory
tar -cvzf archive.tar.gz directory
# Extract the given tarball
tar -xvf archive.tar.gz
# Generate a new key pair (ed25519; DSA is deprecated and disabled by default in current OpenSSH)
ssh-keygen -t ed25519
# Change extended permissions on a file
setfacl -Rm u:username:rw directory
# Convert a file with Windows line endings to Unix
dos2unix file
# Add user (username) to group
usermod -a -G group username 
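
Two of the entries above are easy to check in a scratch shell. The sketch below uses a made-up function, talk, that writes one line to each stream, then shows how each redirection changes what gets captured; it also confirms that sed prints the edited text without modifying the file. The variable names and the sample content are assumptions for the example.

```shell
#!/bin/sh
# Hypothetical function writing one line to each stream
talk() {
    echo "to stdout"
    echo "to stderr" 1>&2
}

only_stdout="$(talk 2>/dev/null)"        # stderr discarded
only_stderr="$(talk 2>&1 1>/dev/null)"   # stderr captured, stdout discarded
both="$(talk 2>&1)"                      # stderr merged into stdout

# sed prints the edited text and leaves the file untouched
notes="$(mktemp)"
printf '%s\n' "one" "two" "three" > "$notes"
without_second="$(sed '2d' "$notes")"
echo "$without_second"
```

Redirections are applied left to right, which is why 2>&1 1>/dev/null keeps stderr: it is pointed at the capture before stdout is sent away.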

Specific tools

Firewall

On my Linux server I use ufw (Uncomplicated Firewall), and it lives up to its name! See it in action below:

ufw allow 22 # ssh
ufw allow 443 # https
ufw allow 80 # http
ufw enable

Mosh

I use Mosh

Mosh (mobile shell)

It's a remote terminal application that allows roaming, supports intermittent connectivity, and provides intelligent local echo and line editing of user keystrokes.

Mosh is a replacement for interactive SSH terminals. It's more robust and responsive, especially over Wi-Fi, cellular, and long-distance links.

Mosh is free software, available for GNU/Linux, BSD, macOS, Solaris, Android, Chrome, and iOS.

ngrok

I use ngrok

This is a proprietary tool, but it has a free tier that is enough for me.
Similar tools may exist; do not hesitate to contact me if you know one!

Ngrok exposes local servers behind NATs and firewalls to the public internet over secure tunnels.

My Own scripts

filename / extname

Similar to basename, these two scripts return the filename (without extension) or the extension of a given file, something I often need in my scripts.
I added them to my PATH, so they are accessible when needed.

filename

#!/bin/bash

if [[ $# -ne 1 ]]
then
    echo "usage: $0 \"filename.ext\""
    echo "Returns filename without extension"
    echo "Note: it applies basename before"
    exit 2
fi

BASENAME="$(basename "$1")"
EXTENSION="${BASENAME##*.}"
FILENAME="${BASENAME%.*}"

echo "${FILENAME}"
exit 0


extname

#!/bin/bash

if [[ $# -ne 1 ]]
then
    echo "usage: $0 \"filename.ext\""
    echo "Returns filename's extension"
    echo "Note: it applies basename before"
    exit 2
fi

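BASENAME="$(basename "$1")"

# Without a dot, "${BASENAME##*.}" would expand to the whole name: print an empty extension instead
if [[ "${BASENAME}" != *.* ]]
then
    echo ""
    exit 0
fi

echo "${BASENAME##*.}"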
exit 0
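
Both scripts rest on two parameter expansions: ##*. removes the longest prefix ending with a dot, and %.* removes the shortest suffix starting with a dot. Here is a short sketch of what each one yields on a name with two dots; the path is made up for the example.

```shell
#!/bin/sh
# Hypothetical path with a double extension
path="/tmp/photos/archive.tar.gz"

base="${path##*/}"         # strip the directories, like basename: archive.tar.gz
ext="${base##*.}"          # longest prefix up to the last dot removed: gz
stem="${base%.*}"          # shortest suffix from the last dot removed: archive.tar
first_stem="${base%%.*}"   # longest suffix from the first dot removed: archive

echo "$base | $ext | $stem | $first_stem"
```

So for archive.tar.gz the scripts report "archive.tar" and "gz"; the %% variant is the one to use if everything after the first dot should count as the extension.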

Retime

This script is more specific to my needs, but still I think it can be useful to someone.
It will try to rename a media file with the date/time of creation.
Note: it uses PHP and exiftool to work.

retime

#!/usr/bin/env php
<?php

function stdout($message)
{
    echo $message."\n";
}

if (2 != count($argv)) {
    stdout('Rename media files to their original creation time (if possible)');
    stdout("Usage {$argv[0]} source_media");

    exit(1);
}

function exiftool(string $file): array
{
    $output = shell_exec('exiftool '.escapeshellarg($file));
    $data = [];
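    foreach (explode("\n", (string) $output) as $line) {
        $l = explode(':', $line, 2);
        // Skip lines without a "key: value" pair (such as the trailing empty line)
        if (2 !== count($l)) {
            continue;
        }
        $data[trim($l[0])] = trim($l[1]);
    }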

    return $data;
}

$file = $argv[1];
$directory = dirname($file);
$extension = pathinfo($file, PATHINFO_EXTENSION);
$data = exiftool($file);

if (isset($data['Creation Date'])) {
    $exifDate = $data['Creation Date'];
} elseif (isset($data['Date Time Original'])) {
    $exifDate = $data['Date Time Original'];
} else {
    stdout("Can not find relevant exif information for {$file}");

    exit(1);
}

$date = date_create($exifDate);
if (!$date) {
    stdout("Can not parse date for {$file}");

    exit(2);
}

$name = $date->format('Y-m-d H-i-s').'.'.$extension;
$destination = $directory.DIRECTORY_SEPARATOR.$name;

if (file_exists($destination)) {
    stdout("A file named {$name} already exists, skipping {$file}");

    exit(3);
}

rename($file, $destination);
$touchDate = $date->format('YmdHi.s');
shell_exec('touch -t '.$touchDate.' '.escapeshellarg($destination));
stdout("Success: {$file} renamed to {$destination}");

exit(0);
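
The core of the script, reading one tag from the exiftool output and turning it into a file name, can be sketched in a few lines of shell. The sample text below is a hypothetical excerpt in exiftool's "Tag Name : value" layout, not the output of a real file, and the jpg extension is assumed.

```shell
#!/bin/sh
# Hypothetical excerpt of exiftool output: one "Tag Name : value" line per tag
sample='File Name                       : IMG_0001.jpg
Creation Date                   : 2021:08:10 12:30:45'

# Keep what follows the first colon on the "Creation Date" line
creation_date="$(printf '%s\n' "$sample" | sed -n 's/^Creation Date *: *//p')"

# Same layout as the PHP format string Y-m-d H-i-s: colons become dashes
new_name="$(printf '%s' "$creation_date" | tr ':' '-').jpg"
echo "$new_name"
```

The value itself contains colons, which is why the PHP version splits each line into two parts only.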

Doubloons

This script looks for duplicate files ("doubloons") in the current directory and its sub-directories.
Note: it's a PHP script that uses a SQLite database.

doubloons

#!/usr/bin/env php
<?php

const DB_FILE_NAME = './doubloon.sqlite';
$shortOptions = [];
// Index options
$shortOptions['e:'] = 'File extension to check ; comma separated ; case insensitive ; default: "jpg,jpeg,gif,png"';
$shortOptions['s:'] = 'Hash algorithm used ; available: md5, sha1 ; default "md5"';
$shortOptions['r'] = 'Reset/rebuild index';
// Program options
$shortOptions['f'] = 'For a given doubloon ; keep the first and delete the others ; default: interactive mode: ask';
$shortOptions['h'] = 'This help';

$options = array_merge([
    'e' => 'jpg,jpeg,gif,png',
    's' => 'md5',
], getopt(implode(array_keys($shortOptions)), []));

function stdout($message)
{
    echo $message."\n";
}

function sqlite(bool $create)
{
    $sql = new SQLite3(DB_FILE_NAME);
    if ($create) {
        $sql->exec('CREATE TABLE "file" ("path" text NOT NULL, "hash" varchar NOT NULL, PRIMARY KEY (path));');
    }

    return $sql;
}

function findDoubloons()
{
    $hashes = [];
    $db = sqlite(false);
    $result = $db->query('SELECT * FROM `file` WHERE `hash` IN (
        SELECT `hash` FROM `file` GROUP BY `hash` HAVING COUNT(*) >= 2
    )');
    while (($row = $result->fetchArray(SQLITE3_ASSOC))) {
        $hashes[$row['hash']][] = $row['path'];
    }

    return $hashes;
}

function index(string $directory, string $extensions, string $hashAlgo)
{
    stdout('Building index...');
    $extensionsRegex = '`\.('.str_replace(',', '|', preg_quote($extensions)).')$`i';
    $find = shell_exec('find -L "'.$directory.'"'); // -L to follow symlinks
    $data = explode("\n", $find);
    $db = sqlite(true);
    foreach ($data as $line) {
        $line = trim($line);
        if (!preg_match($extensionsRegex, $line)) {
            continue;
        }
        if ('sha1' === $hashAlgo) {
            $hash = sha1_file($line);
        } else {
            $hash = md5_file($line);
        }
        $statement = $db->prepare('INSERT OR IGNORE INTO `file` VALUES (:path, :hash);');
        $statement->bindParam(':path', $line, SQLITE3_TEXT);
        $statement->bindParam(':hash', $hash, SQLITE3_TEXT);
        $statement->execute();

        stdout($line.': '.$hash);
    }

    stdout('done.');

    exit(0);
}

function remove($interactive)
{
    $hashes = findDoubloons();
    $delete = [];
    $progress = 0;
    $total = count($hashes);
    stdout('Started, type 99 to finish...');
    foreach ($hashes as $hash => $paths) {
        ++$progress;
        stdout("Found doubloons ({$progress}/{$total}):");
        stdout('    0. No action');
        $candidates = [];
        $c = 0;
        foreach ($paths as $path) {
            ++$c;
            $candidates[$c] = $path;
            stdout("    {$c}. {$path}");
        }
        if (!$interactive) {
            $action = 1;
        } else {
            $action = (int) trim(readline("> keep [0-{$c}]: "));
        }
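        if (99 == $action) {
            break;
        }
        // "No action" (0) or an out-of-range answer: skip, so a typo never queues every copy for deletion
        if (!isset($candidates[$action])) {
            continue;
        }
        unset($candidates[$action]);
        $delete = array_merge($delete, array_values($candidates));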
    }
    stdout("Double check everything and execute:\n");
    foreach ($delete as $d) {
        stdout('rm "'.$d.'"');
    }
    stdout('rm '.DB_FILE_NAME);

    return $delete;
}

if (array_key_exists('h', $options)) {
    stdout("Usage: {$argv[0]} [options]");
    foreach ($shortOptions as $name => $message) {
        $name = trim($name, ':');
        stdout("    -{$name}    {$message}");
    }

    exit(0);
}

if (array_key_exists('r', $options)) {
    @unlink(DB_FILE_NAME);
}

if (!file_exists(DB_FILE_NAME)) {
    // No index exists in this directory, build one
    index('.', $options['e'], $options['s']);
} else {
    // The index already exists, start the program
    remove(!array_key_exists('f', $options));
}
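
The same idea (hash every file, then report the hashes seen more than once) can be sketched without PHP or SQLite. This is not the script above: it swaps md5/sha1 for cksum, a weaker CRC checksum that is available on any POSIX system, and it assumes paths without spaces. The dir variable and the file names are made up for the example.

```shell
#!/bin/sh
# Hypothetical sandbox: a.jpg and b.jpg have the same content, c.jpg differs
dir="$(mktemp -d)"
printf 'same\n'  > "$dir/a.jpg"
printf 'same\n'  > "$dir/b.jpg"
printf 'other\n' > "$dir/c.jpg"

# cksum prints "checksum size path"; after sorting, awk prints the path of
# every file whose checksum and size were already seen on an earlier line
duplicates="$(find "$dir" -type f -name '*.jpg' -exec cksum {} + \
    | sort \
    | awk 'seen[$1 " " $2]++ { print $3 }')"
echo "$duplicates"
```

As with the PHP version, the output is a list to double check, not something to pipe straight into rm.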

Other custom made scripts

I made other scripts that are probably too specific and too messy to be published here. One I liked and used a lot was a PHP script (plus command line tools) that split an mbox file and extracted messages and attachments to a database. It worked well to save my emails when I left Gmail. Let me know if you are interested :)