Have you ever spent a Sunday chasing a backup that should have run at 3 AM but failed because the server went to sleep? Or worse, an expired SSL certificate that took your client's site offline just before Black Friday? It happens. And when it does, it's not a technical glitch — it's a trust problem.
Here at Meteora Web, we've been managing servers for over eight years. And we've learned that the difference between a system that runs smoothly and one that keeps you up at night is often just a poorly written cron job or a rushed systemd timer. In this guide, we're going beyond the classic 0 3 * * * /script.sh: we'll show you how to make scheduling robust, traceable, and recoverable without the headache.
We'll cover advanced cron (logging, environments, locks) and systemd timers, which for many use cases are the modern successor to cron. Not to replace it in every scenario, but to give you another tool when reliability really matters.
Why Plain Cron Is No Longer Enough
The cron daemon is a workhorse: simple, lightweight, universal. But it has limits that become obvious as your project grows. Silent errors? Default logging is minimal. Environment variables? Each job runs with a bare-bones environment. Overlaps? If a script takes longer than expected, cron runs it again anyway, spawning two instances fighting over the same file.
Real example: A client e-commerce site had a cron job generating the Google Shopping feed every hour. The script took 45 minutes, but cron kicked it off every hour without checking. After a couple of hours, four processes were writing to the same XML file. Corrupted feed, unindexed products, lost sales. All because of a missing lock.
systemd timers solve many of these problems: dependencies, resource isolation, structured logging, and configurable failure policies. But they're not a magic wand — you need to understand and configure them properly.
Foundations: Writing a Cron Job That Won't Betray You
Before we jump into timers, let's solidify best practices for traditional cron jobs. Even if you move to timers later, these rules apply to any scheduled task.
1. Always Use Locks to Prevent Overlaps
The double-execution problem is avoidable with a lock file. Here's a reliable shell script using flock:
#!/bin/bash
LOCKFILE="/tmp/myscript.lock"
exec 200>"$LOCKFILE"
flock -n 200 || { echo "Already running, exiting"; exit 1; }
# Your code here
sleep 60 # simulate work
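# No rm here: the kernel releases the lock when the script exits,
# and deleting the file while another process has it open can let two runs overlap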
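If the script crashes, the lock file stays on disk, but that is harmless: a flock lock lives on the open file descriptor, and the kernel releases it as soon as the process exits. Where flock isn't available, a PID-based lock with automatic cleanup at exit is a common fallback, although its check-then-write sequence is not atomic: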
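LOCKFILE="/var/lock/myscript.lock"
if [ -f "$LOCKFILE" ]; then
    pid=$(cat "$LOCKFILE")
    # kill -0 sends no signal; it only tests whether the PID exists
    if kill -0 "$pid" 2>/dev/null; then
        echo "Process $pid still running. Aborting."
        exit 1
    fi
fi
echo $$ > "$LOCKFILE"
# Single quotes defer expansion until the trap fires and keep the path quoted
trap 'rm -f "$LOCKFILE"' EXIT
# work...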
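The flock approach can also be wrapped around any command without touching the script itself. This is a minimal sketch, assuming util-linux flock is installed; the function name run_locked and the exit code 75 for a busy lock are illustrative choices, not something from the setups described here:

```shell
#!/bin/bash
# run_locked LOCKFILE COMMAND [ARGS...]
# Runs COMMAND only if no other process currently holds LOCKFILE.
# Returns 75 when the lock is busy (an arbitrary convention chosen for
# this sketch), otherwise the exit status of COMMAND.
run_locked() {
    local lockfile=$1
    shift
    (
        # Open the lock file on fd 9 inside a subshell. The kernel drops the
        # lock when the subshell exits, even if COMMAND crashes.
        exec 9>"$lockfile" || exit 1
        flock -n 9 || exit 75
        "$@"
    )
}
```

For a single crontab entry, the flock command-line form does the same job, for example `0 * * * * flock -n /var/lock/feed.lock /usr/local/bin/generate_feed.sh` (both paths are hypothetical).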
2. Logging That Really Helps
By default, cron mails each job's output to the crontab owner (or to the MAILTO address) and records in syslog only that the job started. That doesn't scale. Redirect stdout and stderr to a timestamped file directly in your script:
LOGDIR="/var/log/myjobs"
mkdir -p "$LOGDIR"
# Plain % is fine inside a script; the \% escape is only needed on a crontab line
LOG="$LOGDIR/$(date +%Y-%m-%d_%H:%M)-myscript.log"
exec > >(tee -a "$LOG")
exec 2>&1
echo "$(date) - Start"
# work
echo "$(date) - End"
Then, in crontab, suppress cron's own output:
0 3 * * * /usr/local/bin/myscript.sh >/dev/null 2>&1
3. Explicit Environment Variables
Cron runs jobs with a minimal environment: PATH=/usr/bin:/bin, no custom vars. If your script needs JAVA_HOME or an extended PATH, load them at the top:
#!/bin/bash
source /etc/profile # or ~/.bashrc
export PATH="/usr/local/bin:$PATH"
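A quick way to catch environment problems before cron does is to run the command under a similarly stripped-down environment. The sketch below is an approximation: the exact variables cron sets differ between implementations, and the function name run_like_cron is made up for this example.

```shell
#!/bin/bash
# run_like_cron COMMAND_STRING
# Runs COMMAND_STRING under /bin/sh with roughly the environment cron provides:
# env -i wipes everything, then only a handful of variables are put back.
run_like_cron() {
    env -i \
        HOME="${HOME:-/}" \
        LOGNAME="${LOGNAME:-$(id -un)}" \
        PATH=/usr/bin:/bin \
        SHELL=/bin/sh \
        /bin/sh -c "$1"
}
```

Calling `run_like_cron '/usr/local/bin/myscript.sh'` from an interactive shell will often reproduce "command not found" errors that otherwise show up only at 3 AM.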
4. Failure Notifications
Add a notification mechanism. On production servers, we often use curl to a Slack or Telegram webhook at the end of the job with the result:
# Must come right after the command being checked: $? holds only the last exit status
if [ $? -ne 0 ]; then
    curl -s -X POST -H "Content-Type: application/json" \
        -d '{"text":"Backup failed on server X"}' \
        https://hooks.slack.com/services/TTT/BBB/KKK
fi
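Because $? is easy to clobber, one option is to wrap the job in a small helper that captures the exit status itself. This is a sketch under assumptions: run_and_report and notify are hypothetical names, the webhook URL is the placeholder used above, and the message is assumed to contain no double quotes or backslashes (it is pasted into JSON unescaped).

```shell
#!/bin/bash
# Placeholder webhook URL; override it through the environment.
WEBHOOK_URL="${WEBHOOK_URL:-https://hooks.slack.com/services/TTT/BBB/KKK}"

# notify MESSAGE: post a plain-text message to the webhook.
# MESSAGE must not contain double quotes or backslashes in this sketch.
notify() {
    curl -s -X POST -H "Content-Type: application/json" \
        -d "{\"text\":\"$1\"}" "$WEBHOOK_URL" >/dev/null
}

# run_and_report NAME COMMAND [ARGS...]
# Runs COMMAND, sends a notification if it fails, and passes its exit status on.
run_and_report() {
    local name=$1 status=0
    shift
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        notify "$name failed on $HOSTNAME with exit code $status"
    fi
    return "$status"
}
```

A script would then end with something like `run_and_report "backup" /usr/local/bin/backup_db.sh`, so cron still sees the real exit status.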
systemd Timers: When Cron Isn't Enough
systemd is the init system on most mainstream Linux distributions. Its timers let you schedule service units with a level of control cron can't match. Four concrete advantages:
- Real-time dependencies: you can specify that a job runs only after a certain service is active (e.g., only after MySQL is up).
- RandomizedDelaySec: avoids the “thundering herd” of jobs all starting at the same second.
- Persistence across reboots: if the server was down when a run was due, systemd runs the missed job once at the next boot (option Persistent=true).
- Centralized logging via journald: all job output goes to the journal, queryable with journalctl.
Minimal Timer Structure
You need two files: a service (what to do) and a timer (when to do it).
1. Service: /etc/systemd/system/backup-db.service
[Unit]
Description=Daily database backup
[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup_db.sh
User=backup
2. Timer: /etc/systemd/system/backup-db.timer
[Unit]
Description=Run backup-db every day at 3 AM
[Timer]
# "daily" would fire at midnight; spell out the time to match the 3 AM description
OnCalendar=*-*-* 03:00:00
RandomizedDelaySec=30m
Persistent=true
[Install]
WantedBy=timers.target
Then enable and start the timer:
sudo systemctl daemon-reload
sudo systemctl enable backup-db.timer
sudo systemctl start backup-db.timer
Check its status:
systemctl list-timers --all
# or
systemctl status backup-db.timer
The OnCalendar syntax is flexible: *-*-* 03:00:00 for every day, Mon..Fri 10:00 for weekdays, or hourly. See man systemd.time for all options.
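Calendar expressions are easy to get subtly wrong, so it helps to ask systemd how it reads them before enabling a timer. The helper below is a small sketch around systemd-analyze calendar, which prints the normalized form of an expression and when it would next elapse; the function name preview_schedule is our own.

```shell
#!/bin/bash
# preview_schedule EXPR [EXPR...]
# Prints how systemd interprets each OnCalendar expression.
# Stops with a non-zero status at the first expression systemd rejects.
preview_schedule() {
    local expr
    for expr in "$@"; do
        systemd-analyze calendar "$expr" || return 1
    done
}
```

For example, `preview_schedule "daily" "*-*-* 03:00:00" "Mon..Fri 10:00"` shows the difference between daily (midnight) and an explicit 3 AM schedule.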
Handling Failures with Timers
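Unlike cron, systemd can retry a failed job. For a Type=oneshot service this needs a reasonably recent systemd (version 244 or later); older releases reject Restart= on oneshot units. Add to the service: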
[Service]
Restart=on-failure
RestartSec=5min
If the script exits with a non-zero code, systemd tries again after 5 minutes. Limit retry attempts with the following settings, which belong in the [Unit] section rather than [Service]:
[Unit]
StartLimitIntervalSec=1h
StartLimitBurst=3
RandomizedDelaySec for Staggered Jobs
If you have multiple timers scheduled at the same minute, the server might spike. Add RandomizedDelaySec=30m and systemd will delay each execution by a random value between 0 and 30 minutes. Perfect for backup or sync jobs across multiple servers.
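Plain cron has no built-in equivalent, but a job can approximate the same staggering by sleeping for a random interval before it starts. A minimal sketch, assuming bash (it relies on $RANDOM); the function name random_delay is illustrative:

```shell
#!/bin/bash
# random_delay MAX_SECONDS
# Prints a random whole number of seconds in the range [0, MAX_SECONDS).
# bash's $RANDOM yields 0..32767, so MAX_SECONDS should stay below 32768.
random_delay() {
    echo $(( RANDOM % $1 ))
}
```

Putting `sleep "$(random_delay 1800)"` at the top of a cron script spreads its start over roughly half an hour.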
When to Use Cron vs systemd Timers
There's no one-size-fits-all answer. At Meteora Web, we follow this practical rule:
- Cron for simple jobs on single servers, when no dependencies are needed and an occasional missed run is tolerable (e.g., log cleanup, automatic emails).
- systemd timers for critical tasks: backups, report generation, synchronizations, jobs that depend on other services (e.g., after a volume mount).
A real case: a client had a script importing CSV files every hour. With cron, sometimes the script ran while the file was still being written by another process. With a systemd timer we added a Requires= on the service that generates the file, and After=. Zero issues.
Monitoring: Don't Trust, Verify
Setting up a timer or a cron is only the beginning. You need to know when it fails. Here's a minimal checklist:
- Check logs: journalctl -u backup-db.service -n 20 for systemd, or the script's log file for cron.
- Set up an external healthcheck: use a service like healthchecks.io or simply a cron job that writes a timestamp to a static file every hour. An external monitor (UptimeRobot, Better Uptime) checks that the timestamp is recent.
- For cron, add a daily report line in crontab:
0 8 * * * echo "Yesterday's job summary" | mail -s "Cron report" admin@yourdomain.com
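The timestamp-file idea from the checklist can be sketched in a few lines. This assumes GNU stat (the -c option, standard on most Linux distributions); the function names and the heartbeat path in the usage note are hypothetical.

```shell
#!/bin/bash
# heartbeat_age FILE
# Prints how many seconds ago FILE was last modified.
heartbeat_age() {
    local now mtime
    now=$(date +%s)
    mtime=$(stat -c %Y "$1") || return 1
    echo $(( now - mtime ))
}

# is_fresh FILE MAX_AGE_SECONDS
# Succeeds if FILE exists and was modified within the last MAX_AGE_SECONDS.
is_fresh() {
    local age
    age=$(heartbeat_age "$1") || return 1
    [ "$age" -le "$2" ]
}
```

The job ends with `touch /var/lib/myjobs/backup.heartbeat`, and a separate check runs `is_fresh /var/lib/myjobs/backup.heartbeat 90000` (25 hours for a daily job) and raises an alert when it fails. A missing file counts as stale, which also catches a job that has never run.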
Complete Example: Backup with Notification System
Let's put it all together. We'll create a service that backs up a database and sends a Telegram notification if it fails.
Service: /etc/systemd/system/backup-mysql.service
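[Unit]
Description=MySQL Backup
# The unit name varies by distribution (mysql.service, mysqld.service, mariadb.service)
After=mysql.service
Requires=mysql.service
# Rate limiting belongs in [Unit]: at most 3 start attempts per hour
StartLimitIntervalSec=1h
StartLimitBurst=3
[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup_mysql.sh
User=backup
# Restart= on a oneshot service requires systemd 244 or later
Restart=on-failure
RestartSec=5min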
Timer: /etc/systemd/system/backup-mysql.timer
[Unit]
Description=MySQL backup at 2 AM
[Timer]
OnCalendar=*-*-* 02:00:00
RandomizedDelaySec=15m
Persistent=true
[Install]
WantedBy=timers.target
Backup script: /usr/local/bin/backup_mysql.sh
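#!/bin/bash
# Abort on errors, unset variables, and failures anywhere in a pipeline
# (without pipefail, a failed mysqldump would be masked by gzip succeeding)
set -euo pipefail
BACKUP_DIR="/var/backups/mysql"
STAMP="$(date +%Y%m%d_%H%M)"
LOGFILE="$BACKUP_DIR/backup_$STAMP.log"
OUTFILE="$BACKUP_DIR/db_$STAMP.sql.gz"
# Send a Telegram message; TOKEN and ID are placeholders
notify() {
    curl -s -X POST https://api.telegram.org/botTOKEN/sendMessage \
        -d chat_id=ID \
        --data-urlencode text="$1" >/dev/null
}
mkdir -p "$BACKUP_DIR"
exec > "$LOGFILE" 2>&1
# Report any failing command before set -e terminates the script
trap 'notify "MySQL backup FAILED on $(hostname), see $LOGFILE"' ERR
echo "$(date) - Backup started"
# In production, prefer a credentials file (e.g. ~/.my.cnf) over --password on the command line
mysqldump --single-transaction --routines --events \
    --user backup --password=XXXXXXXX \
    --all-databases | gzip > "$OUTFILE"
echo "$(date) - Backup completed: $OUTFILE"
# Optional Telegram notification on success
notify "MySQL backup succeeded: $(du -sh "$OUTFILE" | cut -f1)"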
Don't forget to make the script executable and test it manually before activating the timer.
In Summary — What to Do Now
- Audit your existing cron jobs: add locks, logging, and failure notifications. Start with the most critical ones.
- Pick one job to migrate to a systemd timer: start with a backup or a task that depends on a service. Follow the service + timer pattern.
- Set up monitoring: if you don't have external healthchecks yet, configure a simple periodic ping to a static URL on your server. We use UptimeRobot for this.
- Document everything: keep a markdown file in your project repository listing all scheduled jobs, their logs, and how to verify they work. Next time the server crashes, you'll thank yourself.
If you manage servers for clients or internal projects, these practices aren't optional. They are what separates a system that “almost always works” from one you can sleep soundly on. We learned them the hard way — but you can skip the painful part.
Need a hand with your automation? We at Meteora Web have been working on Linux stacks for eight years. If your cron job isn't up to scratch, let's talk.