Module 4: Service Management with Systemd

Learning Objectives

By the end of this module, you will be able to:

Understand systemd's architecture and its role in modern Linux systems
Identify and work with different systemd unit types
Manage services using systemd commands
Create and modify custom service units
Configure service dependencies
Access and analyze systemd journal logs
Troubleshoot common service-related issues

Introduction to Systemd

Systemd is the default init system and service manager for most modern Linux distributions. It was designed to overcome limitations in the traditional SysV init system by providing parallel service startup, on-demand service activation, and dependency management.

An init system is the first process (PID 1) started by the kernel during boot, responsible for initializing the system and managing services. Unlike its predecessors, systemd goes beyond just starting services at boot—it provides a comprehensive framework for service management throughout the system's lifecycle.

Systemd Architecture and Design Philosophy

Systemd was created with several design principles in mind:

Parallelization: Unlike the sequential SysV init, systemd starts services in parallel whenever possible, significantly reducing boot time. It achieves this by building a dependency graph of units and starting units that do not depend on one another at the same time.
Unified Resource Management: Systemd doesn't just start and stop services; it provides a comprehensive framework for managing all system resources. Each service runs in its own control group (cgroup), allowing for fine-grained resource control and process tracking.
Socket and Bus Activation: Services can be started on-demand when needed, rather than at boot time. For example, a network service can be configured to start only when a connection request arrives at its socket.

Under the hood, systemd implements these features through a modular architecture consisting of:

systemd Core: The main daemon that initializes the system
Unit System: A framework for defining different types of system resources
Journal: A unified logging system
Control Groups Integration: For resource management and process tracking

The systemd process hierarchy looks like this:

systemd (PID 1)
├── systemd-journald (logging)
├── systemd-udevd (device management)
├── systemd-logind (login management)
├── Various service processes
└── User systemd instances (one per logged-in user)

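When the Linux kernel boots, it starts systemd as PID 1. Systemd then reads its unit files (primarily from /etc/systemd/system/ for administrator configuration and from /usr/lib/systemd/system/ or /lib/systemd/system/ for packaged units), builds a dependency graph of the units pulled in by the default target, and activates them in an order that satisfies their dependencies.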
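Directory order matters when the same unit name exists in more than one location. The following is a minimal sketch of that precedence, assuming only the most common system directories (real systemd consults more locations). The function name find_unit_file and its optional root-prefix argument are hypothetical and exist only for illustration.

```shell
# Search common unit directories in precedence order and print the first match.
# $2 is a hypothetical root prefix (default: empty) so the function can be
# tried against a scratch tree instead of the live system.
find_unit_file() (
    unit=$1
    root=${2:-}
    for dir in /etc/systemd/system /run/systemd/system \
               /usr/lib/systemd/system /lib/systemd/system; do
        if [ -e "$root$dir/$unit" ]; then
            printf '%s\n' "$root$dir/$unit"
            exit 0
        fi
    done
    exit 1
)

# Example: where would sshd.service be loaded from on this machine?
find_unit_file sshd.service || echo "sshd.service not found in the directories checked"
```

On a running systemd host, systemctl show -p FragmentPath sshd.service reports the path that systemd actually loaded.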

Systemd Unit Types

Units are systemd's fundamental building blocks – standardized configuration files that describe resources systemd can manage. Each unit is defined by a configuration file with a specific suffix indicating its type.

Service Units (.service)

Service units define daemons or processes that systemd manages. These are the most common unit type and represent the actual applications running on your system.

Example service unit file (sshd.service):

[Unit]
Description=OpenSSH server daemon
Documentation=man:sshd(8) man:sshd_config(5)
After=network.target

[Service]
Type=notify
ExecStart=/usr/sbin/sshd -D
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target

Under the hood, service units configure how systemd manages a process:

How to start, stop, and reload it
What environment it runs in
What permissions it has
What happens if it crashes
What other units it depends on

Target Units (.target)

Targets are groups of units that represent system states (similar to runlevels in SysV init). They provide synchronization points during boot and allow for organized service activation.

Common targets include:

graphical.target: Full graphical desktop environment
multi-user.target: Text-mode multi-user system
emergency.target: Minimal system for recovery

When you boot to a target, systemd activates every unit the target pulls in through its dependencies (for example, the services linked in its .wants/ directory).
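For readers coming from SysV init, the correspondence between runlevels and targets can be written down as a small lookup. This is a sketch of the conventional mapping; the function name runlevel_to_target is hypothetical.

```shell
# Map a SysV runlevel number to the systemd target that conventionally replaces it.
runlevel_to_target() {
    case "$1" in
        0)     echo poweroff.target ;;
        1)     echo rescue.target ;;
        2|3|4) echo multi-user.target ;;
        5)     echo graphical.target ;;
        6)     echo reboot.target ;;
        *)     echo "unknown runlevel: $1" >&2; return 1 ;;
    esac
}

runlevel_to_target 3   # multi-user.target
runlevel_to_target 5   # graphical.target

# On a systemd host, show the target the machine boots into by default.
if command -v systemctl >/dev/null 2>&1; then
    systemctl get-default || true
fi
```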

Timer Units (.timer)

Timer units trigger other units based on time events, functioning similarly to cron jobs but with greater flexibility. A timer unit activates a corresponding service unit when its timer elapses.

Example timer unit (backup.timer):

[Unit]
Description=Daily backup timer

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target

This timer would trigger a corresponding backup.service unit daily at 2:00 AM.
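Calendar expressions are easy to get subtly wrong, so it can help to check one before putting it in a timer. The sketch below assumes systemd-analyze is installed and falls back to a message when it is not. The variable name calendar_expr is illustrative.

```shell
# OnCalendar value taken from the backup.timer example.
calendar_expr='*-*-* 02:00:00'

if command -v systemd-analyze >/dev/null 2>&1; then
    # Prints the normalized form of the expression and when it next elapses.
    systemd-analyze calendar "$calendar_expr" || true
else
    echo "systemd-analyze not available; cannot check: $calendar_expr"
fi
```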

Socket Units (.socket)

Socket units define network or IPC sockets that can be used for socket activation. When a connection request arrives at the socket, systemd automatically starts the corresponding service.

Example socket unit (ssh.socket):

[Unit]
Description=SSH Socket for On-Demand SSH Server

[Socket]
ListenStream=22
Accept=yes

[Install]
WantedBy=sockets.target

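This enables on-demand activation of SSH: only systemd listens on port 22 until a connection is attempted. Because Accept=yes is set, systemd starts a separate service instance for each connection, which requires a matching template unit (ssh@.service for this socket).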

Other Unit Types

Several other specialized unit types exist:

Mount units (.mount): Define filesystem mount points
Device units (.device): Represent hardware devices
Snapshot units (.snapshot): Stored the current state of systemd for later restoration; this unit type was removed in systemd 228 and is not available on current distributions
Scope units (.scope): Organize externally created processes
Slice units (.slice): Group and organize processes for resource management
Path units (.path): Trigger actions based on filesystem changes

Each unit type serves a specific role in the comprehensive system management that systemd provides.
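Because the file suffix identifies the unit type, a directory of unit files can be summarized by suffix. The following is a rough sketch; the function name count_units_by_type is hypothetical, and the default directory is an assumption that varies by distribution.

```shell
# Count regular unit files in a directory, grouped by suffix (unit type).
# Drop-in directories such as foo.service.d/ are skipped by the -f test.
count_units_by_type() (
    dir=${1:-/usr/lib/systemd/system}
    for f in "$dir"/*.*; do
        [ -f "$f" ] || continue
        printf '%s\n' "${f##*.}"
    done | sort | uniq -c | sort -rn
)

# Prints nothing if the default directory does not exist on this host.
count_units_by_type
```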

Service Control Commands

Basic Service Management

Starting a service:

systemctl start nginx.service

This command sends a start request to systemd, which then executes the command defined in the ExecStart directive of the service unit file.

Stopping a service:

systemctl stop nginx.service

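This stops the service. By default, systemd runs the unit's ExecStop command if one is defined, then sends SIGTERM to any remaining processes, and finally sends SIGKILL if they have not exited when the stop timeout (TimeoutStopSec) expires.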

Restarting a service:

systemctl restart nginx.service

This stops the service and then starts it again as a single request. If the service is not running, it is simply started.

Reloading a service's configuration:

systemctl reload nginx.service

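This runs the command defined in the unit's ExecReload directive (often one that sends SIGHUP to the main process), asking the service to re-read its configuration without restarting. The request fails for units that define no reload method.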

Checking a service's status:

systemctl status nginx.service

This provides comprehensive information about the service, including whether it's running, process ID, resource usage, recent log messages, and startup time.
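The status output is meant for people. In scripts, systemctl is-active gives a single machine-readable word. The sketch below wraps it so that it also behaves sensibly on hosts without systemd; the function name service_state and the "unknown" fallback are assumptions made for illustration.

```shell
# Print the unit's active state (active, inactive, failed, ...), or "unknown"
# when systemctl is missing or cannot report a state.
service_state() (
    unit=$1
    if ! command -v systemctl >/dev/null 2>&1; then
        echo unknown
        exit 0
    fi
    # is-active returns non-zero for anything but "active", so ignore the status.
    state=$(systemctl is-active "$unit" 2>/dev/null) || true
    printf '%s\n' "${state:-unknown}"
)

case "$(service_state nginx.service)" in
    active) echo "nginx is running" ;;
    failed) echo "nginx has failed; check: journalctl -u nginx.service" ;;
    *)      echo "nginx is not running" ;;
esac
```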

Enabling and Disabling Services

Enabling a service:

systemctl enable nginx.service

This creates a symbolic link to the service file in the location named by its [Install] section (typically /etc/systemd/system/multi-user.target.wants/), ensuring the service starts automatically at boot.

Disabling a service:

systemctl disable nginx.service

This removes those symbolic links, preventing the service from starting at boot.

Enabling and starting in one step:

systemctl enable --now nginx.service

This both enables the service for future boots and starts it immediately.

Under the hood, the target named by WantedBy= in the unit's [Install] section determines which .wants/ directory receives the link. At boot, systemd starts every unit linked from the .wants/ directory of each target it activates.

Masking and Unmasking

Masking a service:

systemctl mask nginx.service

This creates a symbolic link named nginx.service in /etc/systemd/system/ that points to /dev/null, making it impossible to start the service, manually or as a dependency, until it is unmasked.

Unmasking a service:

systemctl unmask nginx.service

This removes the symbolic link to /dev/null, allowing the service to be started again.
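Enabling and masking are both plain symbolic-link operations, which can be reproduced by hand in a scratch directory. The sketch below only approximates what systemctl does and never touches the real /etc; the scratch layout and the demo.service unit are hypothetical.

```shell
# Scratch tree standing in for /usr/lib/systemd/system and /etc/systemd/system.
scratch=$(mktemp -d)
mkdir -p "$scratch/usr/lib/systemd/system" \
         "$scratch/etc/systemd/system/multi-user.target.wants"
printf '[Install]\nWantedBy=multi-user.target\n' \
    > "$scratch/usr/lib/systemd/system/demo.service"

# Roughly what "systemctl enable demo.service" does:
# link the unit file into the .wants/ directory of its target.
ln -s "$scratch/usr/lib/systemd/system/demo.service" \
      "$scratch/etc/systemd/system/multi-user.target.wants/demo.service"

# Roughly what "systemctl mask demo.service" does:
# shadow the unit with a higher-priority link to /dev/null.
ln -s /dev/null "$scratch/etc/systemd/system/demo.service"

readlink "$scratch/etc/systemd/system/multi-user.target.wants/demo.service"
readlink "$scratch/etc/systemd/system/demo.service"
```

Disabling and unmasking remove the corresponding links again.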

System-wide Service Management

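Listing loaded services (add --all to include inactive ones, or use systemctl list-unit-files to see every installed unit file):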

systemctl list-units --type=service

Listing all active services:

systemctl list-units --type=service --state=active

Viewing failed services:

systemctl --failed

Reloading systemd configuration:

systemctl daemon-reload

This command is essential after creating or modifying unit files on disk: it makes systemd re-read all unit files and rebuild its dependency graph. It does not restart running services.

Creating and Modifying Service Units

Understanding how to create custom service units is a key skill for system administrators, allowing you to automate and manage your own applications.

Service Unit File Structure

Service unit files typically contain three sections:

[Unit]: Provides metadata and defines dependencies
[Service]: Specifies how the service should be started and managed
[Install]: Determines how the service should be enabled
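The section-and-key layout is simple enough to query with standard tools. The following is a deliberately naive sketch: it ignores line continuations, whitespace around the equals sign, and drop-in overrides, all of which systemd itself handles. The function name unit_get and the sample unit are hypothetical.

```shell
# Print the value(s) of KEY inside [SECTION] of a unit file.
unit_get() (
    file=$1 section=$2 key=$3
    awk -v section="[$section]" -v key="$key" '
        /^\[.*\]$/ { in_section = ($0 == section); next }
        in_section && index($0, key "=") == 1 { print substr($0, length(key) + 2) }
    ' "$file"
)

# A throwaway sample unit with the three usual sections.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[Unit]
Description=Demo service

[Service]
Type=simple
ExecStart=/usr/bin/sleep infinity

[Install]
WantedBy=multi-user.target
EOF

unit_get "$sample" Service ExecStart   # /usr/bin/sleep infinity
unit_get "$sample" Install WantedBy    # multi-user.target
```

On a real host, systemctl show with the -p option is the reliable way to read the effective value of a setting.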

Creating a Basic Service Unit

Let's create a simple service for a Python web application:

Create a new file in /etc/systemd/system/myapp.service:

[Unit]
Description=My Python Web Application
After=network.target

[Service]
Type=simple
User=webuser
WorkingDirectory=/opt/myapp
ExecStart=/usr/bin/python3 /opt/myapp/app.py
Restart=on-failure

[Install]
WantedBy=multi-user.target

Reload the systemd configuration:

systemctl daemon-reload

Start and enable the service:

systemctl enable --now myapp.service

Key Service Configuration Options

[Unit] Section Options

Description: A human-readable description of the unit
Documentation: URLs or man pages with documentation
After/Before: Ordering dependencies (doesn't create activation dependencies)
Requires: Units that are activated together with this unit; if one of them fails to start and this unit is ordered After= it, this unit is not started
Wants: Units that should be activated alongside this unit but aren't required
Conflicts: Units that cannot be active at the same time as this unit

[Service] Section Options

Type: Defines the startup type of the service:
- simple: Default; main process is the process started with ExecStart
- forking: The ExecStart process forks and the parent exits; the child continues as the main process
- oneshot: systemd waits for the process to exit before starting follow-up units
- notify: Process sends notification when startup is complete
- dbus: Process registers with D-Bus to signal completion
- idle: Similar to simple, but delayed until other jobs complete
ExecStart: Command to start the service
ExecStop: Command to stop the service (if not specified, process is killed)
ExecReload: Command to reload the service
Restart: When to restart the service automatically
- no: Don't restart automatically (default)
- on-success: Restart only if the service exited cleanly
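- on-failure: Restart on a non-zero exit code, termination by a signal, an operation timeout, or a watchdog timeout
- on-abnormal: Restart on termination by a signal, an operation timeout, or a watchdog timeout, but not on a non-zero exit code
- on-abort: Restart only if the service was terminated by an uncaught signal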
- on-watchdog: Restart if the watchdog timeout expires
- always: Always restart regardless of exit status
RestartSec: Time to wait before restarting the service
TimeoutStartSec: Maximum time for the service to start
TimeoutStopSec: Maximum time for the service to stop
Environment: Environment variables for the service
WorkingDirectory: Working directory for the service
User/Group: User and group to run the service as
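The Restart= policies differ only in which kinds of exit they react to, so they can be summarized as a decision table. The sketch below encodes that table as understood from the systemd.service documentation. The function name should_restart and the cause labels (clean, unclean-exit, unclean-signal, timeout, watchdog) are hypothetical shorthand, not systemd terms.

```shell
# Answer "yes" or "no": would a service with this Restart= policy be restarted
# after exiting for the given cause?
should_restart() (
    policy=$1 cause=$2
    case "$policy:$cause" in
        always:*)                                  echo yes ;;
        on-success:clean)                          echo yes ;;
        on-failure:unclean-exit|on-failure:unclean-signal) echo yes ;;
        on-failure:timeout|on-failure:watchdog)    echo yes ;;
        on-abnormal:unclean-signal|on-abnormal:timeout|on-abnormal:watchdog) echo yes ;;
        on-abort:unclean-signal)                   echo yes ;;
        on-watchdog:watchdog)                      echo yes ;;
        *)                                         echo no ;;
    esac
)

should_restart on-failure unclean-exit    # yes
should_restart on-abnormal unclean-exit   # no
should_restart on-success clean           # yes
```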

[Install] Section Options

WantedBy: Specifies which target should want this service when enabled
RequiredBy: Specifies which target requires this service when enabled
Alias: Additional names for the unit
Also: Additional units to enable/disable alongside this one

Modifying Existing Service Units

To modify an existing system service, it's best to create an override rather than editing the original unit file:

systemctl edit nginx.service

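This opens a text editor where you can add override directives. For example, to set an environment variable for the service's processes (the variable name is illustrative; stock Nginx takes its worker count from worker_processes in nginx.conf, not from the environment):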

[Service]
Environment=NGINX_WORKER_PROCESSES=2

Save and close the editor. Systemd creates the override file at /etc/systemd/system/nginx.service.d/override.conf and reloads its configuration automatically. Restart the service for the change to reach the running process.
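The same kind of override can be created without an interactive editor, which is useful in provisioning scripts. The sketch below writes the drop-in into a scratch directory so it is safe to run anywhere. The dropin_root variable and the chosen settings are illustrative assumptions.

```shell
# Scratch directory standing in for /etc/systemd/system.
dropin_root=$(mktemp -d)

# A drop-in lives in <unit name>.d/ and only needs the settings it changes.
mkdir -p "$dropin_root/nginx.service.d"
cat > "$dropin_root/nginx.service.d/override.conf" <<'EOF'
[Service]
Restart=on-failure
RestartSec=5
EOF

cat "$dropin_root/nginx.service.d/override.conf"

# On a real host, with the file under /etc/systemd/system, follow up with:
#   systemctl daemon-reload
#   systemctl restart nginx.service
```

Unlike systemctl edit, writing the file by hand does not reload systemd, so the daemon-reload step is required.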

Dependencies Between Services

Systemd provides several ways to express dependencies between units, allowing for complex startup sequences and failure handling.

Types of Dependencies

Ordering Dependencies:
- After: This unit starts after the listed units
- Before: This unit starts before the listed units
These only affect the order of startup, not whether units are started.
Requirement Dependencies:
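- Requires: Units that are started together with this unit; if one of them fails to start (and this unit is ordered After= it) or is explicitly stopped, this unit is not started or is stopped as well
- Requisite: Units that must already be active; if they are not, this unit fails immediately instead of starting them
- BindsTo: Like Requires, but this unit is also stopped when the dependency becomes inactive for any reason, not only when it is stopped explicitly
Soft Dependencies:
- Wants: Units that should be started alongside this unit, but aren't required
Negative Dependencies:
- Conflicts: Units that cannot be active at the same time as this unit; starting one stops the other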
Logical Dependencies:
- PartOf: This unit is part of the listed units (stops when they stop)
- PropagatesReloadTo: Reload commands are propagated to the listed units

Creating Service Dependencies

Dependencies are defined in the [Unit] section of service files. For example:

[Unit]
Description=Web Application
After=network.target postgresql.service
Requires=postgresql.service
Wants=monitoring.service

This configuration ensures that:

The service starts after both the network and PostgreSQL service
The service will only start if PostgreSQL is successfully started
The monitoring service is started alongside the web application, but failure of the monitoring service won't prevent the web application from starting

Under the Hood: Dependency Resolution

When systemd starts, it builds a directed graph of all units and their dependencies. It then:

Resolves ordering dependencies to determine the sequence of activation
Checks requirement dependencies to ensure all prerequisites are met
Adds any units connected via soft dependencies to the activation list
Ensures conflicting units are not activated simultaneously

If a dependency fails, systemd handles the failure according to the dependency type.

Practical Example: Web Application Stack

Let's create a dependency chain for a web application stack with a database, application server, and load balancer:

Database service (db.service):

[Unit]
Description=Database Server
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/database-server
Restart=on-failure

[Install]
WantedBy=multi-user.target

Application service (app.service):

[Unit]
Description=Application Server
After=network.target db.service
Requires=db.service

[Service]
Type=simple
ExecStart=/usr/bin/app-server
Restart=on-failure

[Install]
WantedBy=multi-user.target

Load balancer service (lb.service):

[Unit]
Description=Load Balancer
After=network.target app.service
Requires=app.service

[Service]
Type=simple
ExecStart=/usr/bin/load-balancer
Restart=on-failure

[Install]
WantedBy=multi-user.target

With this configuration:

The database starts first
The application server starts only after the database and only if the database started successfully
The load balancer starts only after the application server and only if the application server started successfully
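The ordering in this stack can be derived mechanically from its After= lines with a topological sort. The sketch below uses the standard tsort utility purely as an illustration. Systemd's own scheduler works on jobs and runs independent units in parallel, so this is an analogy rather than a description of its implementation.

```shell
# Each pair "X Y" means X must be started before Y (that is, Y declares After=X).
start_order=$(printf '%s\n' \
    'network.target db.service' \
    'network.target app.service' \
    'network.target lb.service' \
    'db.service app.service' \
    'app.service lb.service' | tsort)

printf '%s\n' "$start_order"
```

Because the units form a single chain, only one order is possible: network.target, db.service, app.service, then lb.service.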

Systemd Journal Logs and Logging Configuration

Systemd includes its own logging system called the journal, which collects and manages log data from various sources in a structured, indexed format.

Accessing Journal Logs

The primary tool for accessing journal logs is journalctl:

Viewing all logs:

journalctl

Viewing logs for a specific service:

journalctl -u nginx.service

Viewing logs since a specific time:

journalctl --since "2023-01-01 12:00:00"

Following logs in real-time:

journalctl -f

Viewing logs from the current boot:

journalctl -b

Viewing kernel messages:

journalctl -k
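These filters can be combined in one invocation. The sketch below wraps a common combination (one unit, error priority and above, a time window) in a helper; the function name recent_errors and its default window are hypothetical choices.

```shell
# Show error-level (and more severe) messages for one unit within a time window.
recent_errors() (
    unit=$1
    since=${2:-"1 hour ago"}
    if command -v journalctl >/dev/null 2>&1; then
        journalctl -u "$unit" -p err --since "$since" --no-pager
    else
        echo "journalctl not available on this host" >&2
        exit 1
    fi
)

# Ignore the exit status so the example also runs on hosts without a journal.
recent_errors nginx.service "today" || true
```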

Journal Log Structure

Under the hood, the journal stores logs in a binary format in /var/log/journal/ (if persistent storage is enabled) or /run/log/journal/ (for volatile storage). Each log entry contains:

A timestamp
The source (service, process, etc.)
Priority level
Message content
Metadata (UID, GID, PID, etc.)

Configuring Journal Logging

The journal's behavior is configured in /etc/systemd/journald.conf. Key configuration options include:

Storage: Determines how logs are stored (persistent, volatile, auto, or none)
Compress: Whether to compress journal files
SystemMaxUse: Maximum disk space used by journals
SystemKeepFree: Minimum free space to preserve on the filesystem
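MaxFileSec: Maximum time to store entries in a single journal file before rotating to a new one (use MaxRetentionSec to limit how long entries are kept)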
ForwardToSyslog: Whether to forward to the traditional syslog daemon
ForwardToConsole: Whether to forward to the system console
ForwardToWall: Whether to forward to all logged-in users

Example configuration to limit journal size:

[Journal]
SystemMaxUse=1G
SystemKeepFree=500M
MaxFileSec=1month

After modifying the configuration, restart the journal service:

systemctl restart systemd-journald

Service-Specific Logging Configuration

Individual services can configure their logging behavior through directives in their service unit files:

[Service]
StandardOutput=journal
StandardError=journal
SyslogIdentifier=myapp
LogLevelMax=info

Structured Logging

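One of the journal's key features is support for structured logging. Programs can attach custom fields to a log message, and those fields can then be used for filtering. With the util-linux logger command, the --journald option reads one FIELD=value pair per line from standard input (the APP_USER and CLIENT_IP field names below are illustrative):

printf 'MESSAGE=User login\nSYSLOG_IDENTIFIER=myapp\nAPP_USER=alice\nCLIENT_IP=192.168.1.10\n' | logger --journald

To filter logs based on a custom field, pass it to journalctl as a FIELD=value match:

journalctl APP_USER=alice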

Hands-on Exercises

Exercise 1: Creating a Custom Service

In this exercise, you'll create a simple service that runs a script periodically.

Create a simple script that logs the current date and system load:

cat > /usr/local/bin/system-monitor.sh << 'EOF'
#!/bin/bash
echo "System Monitor Report - $(date)"
echo "Load Average: $(cat /proc/loadavg)"
echo "Memory Usage: $(free -h | grep Mem)"
echo "----------------------------------------"
EOF

chmod +x /usr/local/bin/system-monitor.sh

Create a systemd service unit for this script:

cat > /etc/systemd/system/system-monitor.service << 'EOF'
[Unit]
Description=System Monitoring Service
After=network.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/system-monitor.sh
StandardOutput=journal

[Install]
WantedBy=multi-user.target
EOF

Create a timer unit to run the service every 5 minutes:

cat > /etc/systemd/system/system-monitor.timer << 'EOF'
[Unit]
Description=Run system monitor every 5 minutes
# No Requires= on the service: that would start it as soon as the timer starts.

[Timer]
Unit=system-monitor.service
OnBootSec=1min
OnUnitActiveSec=5min
AccuracySec=1s

[Install]
WantedBy=timers.target
EOF

Enable and start the timer:

systemctl daemon-reload
systemctl enable --now system-monitor.timer

Verify it's working:

systemctl list-timers | grep system-monitor
journalctl -u system-monitor.service

Exercise 2: Troubleshooting a Failing Service

In this exercise, you'll diagnose and fix a problematic service.

Create a deliberately failing service:

cat > /etc/systemd/system/failing-service.service << 'EOF'
[Unit]
Description=A Service That Will Fail

[Service]
Type=simple
ExecStart=/usr/bin/nonexistent-command
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF

Enable and try to start the service:

systemctl daemon-reload
systemctl enable --now failing-service.service

Investigate the failure:

systemctl status failing-service.service
journalctl -u failing-service.service

Fix the service by creating an override:

systemctl edit failing-service.service

Add the following content:

[Service]
ExecStart=
ExecStart=/bin/bash -c "echo 'Service is now working!'"

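Restart the service and verify it works (because the command exits immediately with status 0, expect the unit to show as inactive (dead) with the echoed message in its log):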

systemctl restart failing-service.service
systemctl status failing-service.service

Exercise 3: Creating Service Dependencies

In this exercise, you'll create a chain of dependent services to understand how dependencies work.

Create three simple services:

Service A

cat > /etc/systemd/system/service-a.service << 'EOF'
[Unit]
Description=Service A - Independent Service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/bash -c "echo 'Service A started' > /tmp/service-a.log"
ExecStop=/bin/bash -c "echo 'Service A stopped' >> /tmp/service-a.log"

[Install]
WantedBy=multi-user.target
EOF

Service B

cat > /etc/systemd/system/service-b.service << 'EOF'
[Unit]
Description=Service B - Depends on Service A
After=service-a.service
Requires=service-a.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/bash -c "echo 'Service B started' > /tmp/service-b.log"
ExecStop=/bin/bash -c "echo 'Service B stopped' >> /tmp/service-b.log"

[Install]
WantedBy=multi-user.target
EOF

Service C

cat > /etc/systemd/system/service-c.service << 'EOF'
[Unit]
Description=Service C - Depends on Service B
After=service-b.service
Requires=service-b.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/bash -c "echo 'Service C started' > /tmp/service-c.log"
ExecStop=/bin/bash -c "echo 'Service C stopped' >> /tmp/service-c.log"

[Install]
WantedBy=multi-user.target
EOF

Enable and start Service C:

systemctl daemon-reload
systemctl enable --now service-c.service

Observe the dependency chain:

systemctl status service-a.service service-b.service service-c.service
cat /tmp/service-*.log

Test the dependency behavior by stopping Service A:
```
systemctl stop service-a.service
systemctl status service-a.service service-b.service service-c.service
cat /tmp/service-*.log
```
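Notice how stopping Service A also stops Services B and C: an explicit stop propagates along the "Requires" chain.

Modify Service C to use "Wants" instead of "Requires". Dependencies cannot be cleared from a drop-in override (an empty Requires= line does not reset the list), so edit a full copy of the unit instead:
```
systemctl edit --full service-c.service
```
Change the [Unit] section so that it reads:
```
[Unit]
Description=Service C - Depends on Service B
After=service-b.service
Wants=service-b.service
```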
Test the new behavior:
```
systemctl start service-a.service service-b.service service-c.service
systemctl stop service-b.service
systemctl status service-c.service
```
Notice that Service C remains running even though Service B stopped, demonstrating the difference between "Requires" and "Wants".

Common Pitfalls and Troubleshooting

Common Service Management Issues

Service fails to start
- Check the service status and journal logs:
```
systemctl status myservice.service
journalctl -u myservice.service
```
- Verify permissions on executable files and directories
- Check for missing dependencies
- Validate the ExecStart command

Service starts but immediately exits
- This often indicates a configuration error in the application
- Check the application's own logs and try running the ExecStart command manually
- Ensure the service Type is appropriate (e.g., "forking" for daemons that fork)

Service works when started manually but fails at boot
- This may be due to ordering issues with dependencies. Add appropriate "After=" directives and check resource availability.

Changes to unit files aren't taking effect
- Remember to run systemctl daemon-reload after modifying unit files. For overrides, ensure they're in the correct .d directory.

Service doesn't stop cleanly
- Configure an appropriate ExecStop command, adjust TimeoutStopSec, and ensure the service handles SIGTERM signals properly.
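When a service fails before its program even runs, systemctl status reports a numeric code alongside a short name, such as status=203/EXEC. The lookup below covers a few codes that come up often. It is a partial sketch rather than the full list from systemd.exec(5), and the function name explain_exec_status is hypothetical.

```shell
# Translate a few of systemd's process-setup exit codes into plain language.
explain_exec_status() {
    case "$1" in
        200) echo "CHDIR: could not change to the configured WorkingDirectory" ;;
        203) echo "EXEC: the ExecStart binary could not be executed (wrong path, missing file, or not executable)" ;;
        216) echo "GROUP: the configured Group could not be resolved" ;;
        217) echo "USER: the configured User could not be resolved" ;;
        *)   echo "code $1 is not covered by this sketch; see systemd.exec(5)" ;;
    esac
}

explain_exec_status 203
explain_exec_status 217
```

A unit whose ExecStart points at a nonexistent command fails with 203/EXEC, which immediately narrows the search to the command path.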

Advanced Troubleshooting Techniques

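Debugging boot timing and environment issues

systemd-analyze
systemd-analyze critical-chain

The first command reports how long the boot took; the second shows the chain of units the boot waited on. Check a service's environment variables with: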

systemctl show myservice.service -p Environment

Checking unit file syntax

systemd-analyze verify /etc/systemd/system/myservice.service

Finding dependency problems

systemctl list-dependencies myservice.service
systemctl list-dependencies --reverse myservice.service

Inspecting service resource usage
```
systemd-cgtop
```

Testing service configurations

systemd-run --unit=test-service /path/to/executable

Common Journal Issues

Journal fills up disk space
- Configure journal size limits in /etc/systemd/journald.conf
- Manually clear old entries:
```
journalctl --vacuum-time=1d
journalctl --vacuum-size=500M
```

Logs not persistent across reboots
- Ensure persistent storage is enabled (with the default Storage=auto, creating the directory is enough):
```
mkdir -p /var/log/journal
systemctl restart systemd-journald
```

Too many or too few logs
- Adjust LogLevelMax in the service unit file and configure rate limiting (RateLimitIntervalSec and RateLimitBurst) in journald.conf

Quick Reference Summary

Essential Commands

Command	Description
`systemctl start [unit]`	Start a unit
`systemctl stop [unit]`	Stop a unit
`systemctl restart [unit]`	Restart a unit
`systemctl status [unit]`	Check unit status
`systemctl enable [unit]`	Enable unit to start at boot
`systemctl disable [unit]`	Disable unit from starting at boot
`systemctl daemon-reload`	Reload unit files after changes
`journalctl -u [unit]`	View logs for a specific unit
`journalctl -f`	Follow logs in real-time
`systemctl edit [unit]`	Create/edit unit file overrides

Key Unit File Sections

[Unit]: Metadata, descriptions, and dependencies
[Service]: Service-specific behaviors and commands
[Install]: Installation information for enabling/disabling

Common Service Types

simple: Main process is started directly with ExecStart
forking: Process forks and the parent exits; the child continues as the main process
oneshot: systemd waits for the process to exit before starting follow-up units
notify: Process signals when it's ready
dbus: Process signals readiness via D-Bus

Service Dependency Types

After/Before: Ordering without activation dependency
Requires: Hard dependency, must be active
Wants: Soft dependency, try to activate but not required
BindsTo: Hard dependency that extends to runtime

Journal Storage Options

persistent: Stored on disk in /var/log/journal/
volatile: Stored in memory in /run/log/journal/
auto: Persistent if directory exists, otherwise volatile
none: No storage, discards all messages

With this knowledge, you now have a comprehensive understanding of systemd service management in Linux, from basic service control to creating complex dependency chains and troubleshooting issues. This foundation will serve you well as you continue to work with and maintain Linux systems.