Module 4: Service Management with Systemd

Learning Objectives

By the end of this module, you will be able to:

Introduction to Systemd

Systemd is the default init system and service manager for most modern Linux distributions. It was designed to overcome limitations in the traditional SysV init system by providing parallel service startup, on-demand service activation, and dependency management.

An init system is the first process (PID 1) started by the kernel during boot, responsible for initializing the system and managing services. Unlike its predecessors, systemd goes beyond just starting services at boot—it provides a comprehensive framework for service management throughout the system's lifecycle.

Systemd Architecture and Design Philosophy

Systemd was created with several design principles in mind:

Under the hood, systemd implements these features through a modular architecture consisting of:

  1. systemd Core: The main daemon that initializes the system
  2. Unit System: A framework for defining different types of system resources
  3. Journal: A unified logging system
  4. Control Groups Integration: For resource management and process tracking

The systemd process hierarchy looks like this:

systemd (PID 1)
├── systemd-journald (logging)
├── systemd-udevd (device management)
├── systemd-logind (login management)
├── Various service processes
└── User systemd instances (one per logged-in user)

When the Linux kernel boots, it starts systemd as PID 1. Systemd then reads its unit files (primarily from /etc/systemd/system/ for administrator configuration and /usr/lib/systemd/system/ or /lib/systemd/system/ for units shipped by packages), builds a dependency graph of units to start, and activates them in dependency order, in parallel where possible.
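The lookup order matters when the same unit name exists in more than one directory. A minimal sketch, assuming the usual system-unit search order (/etc/systemd/system, then /run/systemd/system, then /usr/lib/systemd/system); the function name find_unit is hypothetical, and on a real system systemctl cat shows the file systemd actually uses:

```shell
#!/bin/sh
# find_unit NAME [DIR...]: print the first unit file that matches NAME,
# checking directories in precedence order (administrator overrides first).
# The default list mirrors the common system-unit search path; pass other
# directories to experiment without touching the real ones.
find_unit() {
    name=$1
    shift
    if [ "$#" -eq 0 ]; then
        set -- /etc/systemd/system /run/systemd/system /usr/lib/systemd/system
    fi
    for dir in "$@"; do
        if [ -e "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Demonstration in a scratch directory: the "etc" copy shadows the "lib" copy.
demo=$(mktemp -d)
mkdir -p "$demo/etc" "$demo/lib"
touch "$demo/etc/myapp.service" "$demo/lib/myapp.service" "$demo/lib/other.service"
find_unit myapp.service "$demo/etc" "$demo/lib"
find_unit other.service "$demo/etc" "$demo/lib"
```

Passing the directories as arguments keeps the sketch testable; called with only a unit name, it falls back to the assumed system paths.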

Systemd Unit Types

Units are systemd's fundamental building blocks – standardized configuration files that describe resources systemd can manage. Each unit is defined by a configuration file with a specific suffix indicating its type.

Service Units (.service)

Service units define daemons or processes that systemd manages. These are the most common unit type and represent the actual applications running on your system.

Example service unit file (sshd.service):

[Unit]
Description=OpenSSH server daemon
Documentation=man:sshd(8) man:sshd_config(5)
After=network.target

[Service]
Type=notify
ExecStart=/usr/sbin/sshd -D
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target

Under the hood, service units configure how systemd manages a process:

Target Units (.target)

Targets are groups of units that represent system states (similar to runlevels in SysV init). They provide synchronization points during boot and allow for organized service activation.

Common targets include:

When you boot to a target, systemd activates every unit that the target pulls in through its dependencies (for example, the units linked in its .wants/ directory).

Timer Units (.timer)

Timer units trigger other units based on time events, functioning similarly to cron jobs but with greater flexibility. A timer unit activates a corresponding service unit when its timer elapses.

Example timer unit (backup.timer):

[Unit]
Description=Daily backup timer

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target

This timer would trigger a corresponding backup.service unit daily at 2:00 AM.
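The timer needs a service of the same base name to activate. A minimal sketch of what that backup.service might look like; the script path /usr/local/bin/backup.sh is hypothetical, and the file is written to a scratch directory so the snippet can be tried without root (on a real system it would go in /etc/systemd/system/, followed by systemctl daemon-reload):

```shell
#!/bin/sh
# Write an illustrative backup.service next to where backup.timer would live.
# UNIT_DIR defaults to a scratch directory; it is not a systemd search path.
UNIT_DIR=${UNIT_DIR:-$(mktemp -d)}

cat > "$UNIT_DIR/backup.service" << 'EOF'
[Unit]
Description=Daily backup job

[Service]
# oneshot: the unit counts as started only after the script has exited
Type=oneshot
# Hypothetical script path; replace with the real backup command
ExecStart=/usr/local/bin/backup.sh
EOF

# No [Install] section is needed: the timer, not the boot process, starts it.
echo "wrote $UNIT_DIR/backup.service"
```

Only the timer is enabled in this arrangement; the service stays disabled and runs when the timer fires.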

Socket Units (.socket)

Socket units define network or IPC sockets that can be used for socket activation. When a connection request arrives at the socket, systemd automatically starts the corresponding service.

Example socket unit (ssh.socket):

[Unit]
Description=SSH Socket for On-Demand SSH Server

[Socket]
ListenStream=22
Accept=yes

[Install]
WantedBy=sockets.target

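This enables on-demand activation of SSH services only when a connection is attempted. With Accept=yes, systemd accepts each incoming connection itself and starts a separate service instance for it from a template unit (for example sshd@.service).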

Other Unit Types

Several other specialized unit types exist:

Each unit type serves a specific role in the comprehensive system management that systemd provides.

Service Control Commands

Basic Service Management

Starting a service:

systemctl start nginx.service

This command sends a start request to systemd, which then executes the command defined in the ExecStart directive of the service unit file.

Stopping a service:

systemctl stop nginx.service

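This triggers the graceful shutdown procedure for the service: systemd runs ExecStop if the unit defines one, otherwise it sends SIGTERM to the service's processes, and it follows up with SIGKILL if they have not exited when the stop timeout expires.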

Restarting a service:

systemctl restart nginx.service

This stops the service and then starts it again as a single request; if the service is not running, it is simply started.

Reloading a service's configuration:

systemctl reload nginx.service

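This runs the command defined in the unit's ExecReload directive (often one that sends SIGHUP to the main process), instructing the service to reload its configuration without restarting. It fails if the unit does not define ExecReload.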

Checking a service's status:

systemctl status nginx.service

This provides comprehensive information about the service, including whether it's running, process ID, resource usage, recent log messages, and startup time.
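In scripts, the terse query subcommands are easier to work with than parsing status output. A minimal sketch using systemctl is-active and systemctl is-enabled, each of which prints a single state word; the function name unit_summary is hypothetical:

```shell
#!/bin/sh
# unit_summary UNIT: print "UNIT: <active-state>, <enablement-state>".
# Both subcommands exit non-zero for states such as "inactive" or "disabled",
# so "|| true" keeps the function usable in scripts running under "set -e".
unit_summary() {
    active=$(systemctl is-active "$1" 2>/dev/null) || true
    enabled=$(systemctl is-enabled "$1" 2>/dev/null) || true
    printf '%s: %s, %s\n' "$1" "${active:-unknown}" "${enabled:-unknown}"
}

# Example call (the unit name is a placeholder):
#   unit_summary nginx.service
```

The function prints "unknown" when systemctl produces no output, for instance on a machine where systemd is not running.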

Enabling and Disabling Services

Enabling a service:

systemctl enable nginx.service

This creates a symbolic link to the service file in the directory named by the unit's [Install] section (typically /etc/systemd/system/multi-user.target.wants/), ensuring the service starts automatically at boot.

Disabling a service:

systemctl disable nginx.service

This removes those symbolic links, preventing the service from starting at boot.

Enabling and starting in one step:

systemctl enable --now nginx.service

This both enables the service for future boots and starts it immediately.

Under the hood, the link placed in the target's .wants/ directory adds the service as a Wants= dependency of that target, so the service is started whenever the target is reached. Enabling does not start the service in the current session unless --now is given.
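What enable does can be reproduced by hand in a scratch directory. A minimal sketch, assuming a unit whose [Install] section says WantedBy=multi-user.target; the directory layout mimics the real one but lives under a temporary path, so nothing here affects the running system:

```shell
#!/bin/sh
# Mimic "systemctl enable nginx.service" with plain symlinks.
root=$(mktemp -d)
unit_dir="$root/usr/lib/systemd/system"     # where a package would ship the unit
wants_dir="$root/etc/systemd/system/multi-user.target.wants"

mkdir -p "$unit_dir" "$wants_dir"
printf '[Install]\nWantedBy=multi-user.target\n' > "$unit_dir/nginx.service"

# "enable": the link lives in the target's .wants/ directory and points at the unit file
ln -s "$unit_dir/nginx.service" "$wants_dir/nginx.service"
readlink "$wants_dir/nginx.service"

# "disable": removing the link leaves the unit file itself untouched
rm "$wants_dir/nginx.service"
ls "$unit_dir"
```

The link's direction is the point to notice: it sits in the .wants/ directory and refers back to the unit file, not the other way around.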

Masking and Unmasking

Masking a service:

systemctl mask nginx.service

This creates a symbolic link at /etc/systemd/system/nginx.service that points to /dev/null, making it impossible to start the service, manually or as a dependency, until it is unmasked.

Unmasking a service:

systemctl unmask nginx.service

This removes the symbolic link to /dev/null, allowing the service to be started again.
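Because a mask is just a symlink to /dev/null, it can be detected without asking systemd. A minimal sketch; the helper name is_masked is hypothetical, and systemctl is-enabled reports "masked" for the same condition on a live system:

```shell
#!/bin/sh
# is_masked PATH: succeed if PATH is a symlink that points at /dev/null.
is_masked() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "/dev/null" ]
}

# Demonstration in a scratch directory standing in for /etc/systemd/system.
etc=$(mktemp -d)
ln -s /dev/null "$etc/nginx.service"     # what "systemctl mask" creates
touch "$etc/myapp.service"               # an ordinary unit file

for unit in nginx.service myapp.service; do
    if is_masked "$etc/$unit"; then
        echo "$unit: masked"
    else
        echo "$unit: not masked"
    fi
done
```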

System-wide Service Management

Listing loaded services (active ones by default; add --all to include inactive units):

systemctl list-units --type=service

Listing all active services:

systemctl list-units --type=service --state=active

Viewing failed services:

systemctl --failed

Reloading systemd configuration:

systemctl daemon-reload

This command is essential after modifying any unit files, as it instructs systemd to reload its configuration.

Creating and Modifying Service Units

Understanding how to create custom service units is a key skill for system administrators, allowing you to automate and manage your own applications.

Service Unit File Structure

Service unit files typically contain three sections:

  1. [Unit]: Provides metadata and defines dependencies
  2. [Service]: Specifies how the service should be started and managed
  3. [Install]: Determines how the service should be enabled

Creating a Basic Service Unit

Let's create a simple service for a Python web application:

  1. Create a new file in /etc/systemd/system/myapp.service:
[Unit]
Description=My Python Web Application
After=network.target

[Service]
Type=simple
User=webuser
WorkingDirectory=/opt/myapp
ExecStart=/usr/bin/python3 /opt/myapp/app.py
Restart=on-failure

[Install]
WantedBy=multi-user.target
  2. Reload the systemd configuration:
systemctl daemon-reload
  3. Start and enable the service:
systemctl enable --now myapp.service
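Before enabling a new unit, it can help to confirm that the program named in ExecStart actually exists and is executable. A minimal sketch with hypothetical helper names (unit_value, check_exec); it only handles simple Key=Value lines and is no substitute for systemd-analyze verify:

```shell
#!/bin/sh
# unit_value FILE KEY: print the last value assigned to KEY in FILE.
unit_value() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# check_exec FILE: succeed if the first word of ExecStart is an executable file.
check_exec() {
    cmd=$(unit_value "$1" ExecStart)
    prog=${cmd%% *}                  # strip arguments, keep the program path
    [ -n "$prog" ] && [ -x "$prog" ]
}

# Demonstration on a throwaway unit file; /bin/sh stands in for the real program.
unit=$(mktemp)
cat > "$unit" << 'EOF'
[Service]
Type=simple
ExecStart=/bin/sh -c "sleep 60"
EOF

unit_value "$unit" Type
if check_exec "$unit"; then echo "ExecStart program found"; fi
```

For the myapp.service example, the same check would be run against /etc/systemd/system/myapp.service.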

Key Service Configuration Options

[Unit] Section Options

[Service] Section Options

[Install] Section Options

Modifying Existing Service Units

To modify an existing system service, it's best to create an override rather than editing the original unit file:

systemctl edit nginx.service

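This opens a text editor where you can add override directives. For example, to set an environment variable for the service (the variable name is illustrative; nginx itself takes its worker count from the worker_processes directive in nginx.conf):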

[Service]
Environment=NGINX_WORKER_PROCESSES=2

Save and close the editor. Systemd will automatically create an override file in /etc/systemd/system/nginx.service.d/override.conf and reload the configuration.

Dependencies Between Services

Systemd provides several ways to express dependencies between units, allowing for complex startup sequences and failure handling.

Types of Dependencies

  1. Ordering Dependencies:
    • After: This unit starts after the listed units
    • Before: This unit starts before the listed units

    These only affect the order of startup, not whether units are started.

  2. Requirement Dependencies:
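    • Requires: Units that are started along with this unit; this unit is stopped if one of them is explicitly stopped and, when combined with After=, is not started if one of them fails to start
    • Requisite: Units that must already be active; they are not started on this unit's behalf, and this unit fails to start if they are not running
    • BindsTo: Similar to Requires, but this unit also stops if the dependency stops for any reason, including unexpectedly
  3. Soft and Negative Dependencies:
    • Wants: Units that should be started alongside this unit, but aren't required
    • Conflicts: Units that cannot be active at the same time as this unit; starting one stops the other
  4. Logical Dependencies:
    • PartOf: Stop and restart requests on the listed units are propagated to this unit
    • PropagatesReloadTo: Reload requests on this unit are propagated to the listed units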

Creating Service Dependencies

Dependencies are defined in the [Unit] section of service files. For example:

[Unit]
Description=Web Application
After=network.target postgresql.service
Requires=postgresql.service
Wants=monitoring.service

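This configuration ensures that:

  • The web application is ordered after network.target and postgresql.service
  • Starting the web application also starts postgresql.service, and the application is not started if PostgreSQL fails to start
  • monitoring.service is started alongside the application, but its failure does not affect the application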

Under the Hood: Dependency Resolution

When systemd starts, it builds a directed graph of all units and their dependencies. It then:

  1. Resolves ordering dependencies to determine the sequence of activation
  2. Checks requirement dependencies to ensure all prerequisites are met
  3. Adds any units connected via soft dependencies to the activation list
  4. Ensures conflicting units are not activated simultaneously

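If a dependency fails, systemd handles the failure according to the dependency type: a failed Requires= dependency (combined with After=) causes the start of the dependent unit to fail as well, while a failed Wants= dependency is ignored.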
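Resolving ordering dependencies amounts to a topological sort. A minimal sketch using the coreutils tsort tool, which reads "X Y" pairs meaning X must come before Y; the unit names are placeholders, and real systemd also starts independent units in parallel instead of producing a single line:

```shell
#!/bin/sh
# Each pair "A B" encodes "B has After=A", i.e. A is ordered before B.
ordering_pairs() {
    cat << 'EOF'
network.target db.service
db.service app.service
app.service lb.service
EOF
}

# tsort prints one valid start order, earliest unit first.
ordering_pairs | tsort
```

Because these pairs form a single chain, only one order is possible; with independent units, tsort would pick one of several valid orders.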

Practical Example: Web Application Stack

Let's create a dependency chain for a web application stack with a database, application server, and load balancer:

  1. Database service (db.service):
[Unit]
Description=Database Server
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/database-server
Restart=on-failure

[Install]
WantedBy=multi-user.target
  2. Application service (app.service):
[Unit]
Description=Application Server
After=network.target db.service
Requires=db.service

[Service]
Type=simple
ExecStart=/usr/bin/app-server
Restart=on-failure

[Install]
WantedBy=multi-user.target
  3. Load balancer service (lb.service):
[Unit]
Description=Load Balancer
After=network.target app.service
Requires=app.service

[Service]
Type=simple
ExecStart=/usr/bin/load-balancer
Restart=on-failure

[Install]
WantedBy=multi-user.target

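With this configuration:

  • Starting lb.service pulls in app.service, which in turn pulls in db.service
  • The After= directives make the units start in the order db.service, app.service, lb.service
  • Stopping db.service also stops app.service and lb.service, because Requires= propagates an explicit stop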

Systemd Journal Logs and Logging Configuration

Systemd includes its own logging system called the journal, which collects and manages log data from various sources in a structured, indexed format.

Accessing Journal Logs

The primary tool for accessing journal logs is journalctl:

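journalctl                                  # all entries, oldest first
journalctl -u nginx.service                 # entries for one unit
journalctl --since "2023-01-01 12:00:00"    # entries since a timestamp
journalctl -f                               # follow new entries as they arrive
journalctl -b                               # entries from the current boot
journalctl -k                               # kernel messages only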

Journal Log Structure

Under the hood, the journal stores logs in a binary format in /var/log/journal/ (if persistent storage is enabled) or /run/log/journal/ (for volatile storage). Each log entry contains:

Configuring Journal Logging

The journal's behavior is configured in /etc/systemd/journald.conf. Key configuration options include:

Example configuration to limit journal size:

[Journal]
SystemMaxUse=1G
SystemKeepFree=500M
MaxFileSec=1month

After modifying the configuration, restart the journal service:

systemctl restart systemd-journald

Service-Specific Logging Configuration

Individual services can configure their logging behavior through directives in their service unit files:

[Service]
StandardOutput=journal
StandardError=journal
SyslogIdentifier=myapp
LogLevelMax=info

Structured Logging

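One of the journal's key features is support for structured logging. Programs can send structured metadata along with log messages, which can then be used for filtering. With the logger tool from util-linux, the --journald option reads one FIELD=value pair per line from standard input (custom field names are uppercase; USER_NAME and CLIENT_IP here are illustrative):

printf '%s\n' 'MESSAGE=User login' 'SYSLOG_IDENTIFIER=myapp' 'USER_NAME=alice' 'CLIENT_IP=192.168.1.10' | logger --journald

To filter logs based on structured fields:

journalctl USER_NAME=alice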

Hands-on Exercises

Exercise 1: Creating a Custom Service

In this exercise, you'll create a simple service that runs a script periodically.

  1. Create a simple script that logs the current date and system load:
    cat > /usr/local/bin/system-monitor.sh << 'EOF'
    #!/bin/bash
    echo "System Monitor Report - $(date)"
    echo "Load Average: $(cat /proc/loadavg)"
    echo "Memory Usage: $(free -h | grep Mem)"
    echo "----------------------------------------"
    EOF
    
    chmod +x /usr/local/bin/system-monitor.sh
    
  2. Create a systemd service unit for this script:
    cat > /etc/systemd/system/system-monitor.service << 'EOF'
    [Unit]
    Description=System Monitoring Service
    After=network.target
    
    [Service]
    Type=oneshot
    ExecStart=/usr/local/bin/system-monitor.sh
    StandardOutput=journal
    
    [Install]
    WantedBy=multi-user.target
    EOF
    
  3. Create a timer unit to run the service every 5 minutes:
    cat > /etc/systemd/system/system-monitor.timer << 'EOF'
    [Unit]
    Description=Run system monitor every 5 minutes
    
    [Timer]
    Unit=system-monitor.service
    OnBootSec=1min
    OnUnitActiveSec=5min
    AccuracySec=1s
    
    [Install]
    WantedBy=timers.target
    EOF
    
  4. Enable and start the timer:
    systemctl daemon-reload
    systemctl enable --now system-monitor.timer
    
  5. Verify it's working:
    systemctl list-timers | grep system-monitor
    journalctl -u system-monitor.service
    

Exercise 2: Troubleshooting a Failing Service

In this exercise, you'll diagnose and fix a problematic service.

  1. Create a deliberately failing service:
    cat > /etc/systemd/system/failing-service.service << 'EOF'
    [Unit]
    Description=A Service That Will Fail
    
    [Service]
    Type=simple
    ExecStart=/usr/bin/nonexistent-command
    Restart=on-failure
    RestartSec=10
    
    [Install]
    WantedBy=multi-user.target
    EOF
    
  2. Enable and try to start the service:
    systemctl daemon-reload
    systemctl enable --now failing-service.service
    
  3. Investigate the failure:
    systemctl status failing-service.service
    journalctl -u failing-service.service
    
  4. Fix the service by creating an override:
    systemctl edit failing-service.service
    

    Add the following content:

    [Service]
    ExecStart=
    ExecStart=/bin/bash -c "echo 'Service is now working!'"
    
  5. Restart the service and verify it works:
    systemctl restart failing-service.service
    systemctl status failing-service.service
    

Exercise 3: Creating Service Dependencies

In this exercise, you'll create a chain of dependent services to understand how dependencies work.

  1. Create three simple services:

    Service A

    cat > /etc/systemd/system/service-a.service << 'EOF'
    [Unit]
    Description=Service A - Independent Service
    
    [Service]
    Type=oneshot
    RemainAfterExit=yes
    ExecStart=/bin/bash -c "echo 'Service A started' > /tmp/service-a.log"
    ExecStop=/bin/bash -c "echo 'Service A stopped' >> /tmp/service-a.log"
    
    [Install]
    WantedBy=multi-user.target
    EOF
    

    Service B

    cat > /etc/systemd/system/service-b.service << 'EOF'
    [Unit]
    Description=Service B - Depends on Service A
    After=service-a.service
    Requires=service-a.service
    
    [Service]
    Type=oneshot
    RemainAfterExit=yes
    ExecStart=/bin/bash -c "echo 'Service B started' > /tmp/service-b.log"
    ExecStop=/bin/bash -c "echo 'Service B stopped' >> /tmp/service-b.log"
    
    [Install]
    WantedBy=multi-user.target
    EOF
    

    Service C

    cat > /etc/systemd/system/service-c.service << 'EOF'
    [Unit]
    Description=Service C - Depends on Service B
    After=service-b.service
    Requires=service-b.service
    
    [Service]
    Type=oneshot
    RemainAfterExit=yes
    ExecStart=/bin/bash -c "echo 'Service C started' > /tmp/service-c.log"
    ExecStop=/bin/bash -c "echo 'Service C stopped' >> /tmp/service-c.log"
    
    [Install]
    WantedBy=multi-user.target
    EOF
    
  2. Enable and start Service C:
    systemctl daemon-reload
    systemctl enable --now service-c.service
    
  3. Observe the dependency chain:
    systemctl status service-a.service service-b.service service-c.service
    cat /tmp/service-*.log
    
  4. Test the dependency behavior by stopping Service A:
    systemctl stop service-a.service
    systemctl status service-a.service service-b.service service-c.service
    cat /tmp/service-*.log
    

    Notice that stopping Service A also stops Service B and Service C: an explicit stop propagates through each "Requires" dependency.

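  5. Modify Service C to use "Wants" instead of "Requires". Dependencies cannot be removed through a drop-in override (an empty "Requires=" does not reset the list), so edit the full unit file:
    systemctl edit --full service-c.service
    

    In the [Unit] section, replace the Requires= line with:

    Wants=service-b.service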
    
  6. Test the new behavior:
    systemctl start service-a.service service-b.service service-c.service
    systemctl stop service-b.service
    systemctl status service-c.service
    

    Notice that Service C remains running even though Service B stopped, demonstrating the difference between "Requires" and "Wants".

Common Pitfalls and Troubleshooting

Common Service Management Issues

  1. Service fails to start

    - Check the service status and journal logs:

    systemctl status myservice.service
    journalctl -u myservice.service
    

    - Verify permissions on executable files and directories

    - Check for missing dependencies

    - Validate the ExecStart command

  2. Service starts but immediately exits

    - This often indicates a configuration error in the application

    - Check the application's own logs and try running the ExecStart command manually

    - Ensure the service Type is appropriate (e.g., "forking" for daemons that fork)

  3. Service works when started manually but fails at boot

    - This may be due to ordering issues with dependencies. Add appropriate "After=" directives and check resource availability.

  4. Changes to unit files aren't taking effect

    - Remember to run systemctl daemon-reload after modifying unit files. For overrides, ensure they're in the correct .d directory.

  5. Service doesn't stop cleanly

    - Configure an appropriate ExecStop command, adjust TimeoutStopSec, and ensure the service handles SIGTERM signals properly.
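Item 5 depends on the service reacting to SIGTERM, which is what systemd sends by default on stop. A minimal sketch of a shell-based worker that traps the signal and exits cleanly; the function name and log message are illustrative:

```shell
#!/bin/sh
# worker: loop until SIGTERM arrives, then clean up and exit with status 0.
worker() {
    trap 'echo "worker: cleaning up"; exit 0' TERM
    while :; do
        # Sleep in the background and wait on it so the trap runs promptly;
        # wait returns non-zero when a trapped signal interrupts it.
        sleep 1 &
        wait $! || true
    done
}

log=$(mktemp)
worker > "$log" &        # run the worker as a background process
pid=$!
sleep 1                  # give it time to install the trap
kill -TERM "$pid"        # the signal "systemctl stop" sends by default
wait "$pid"
status=$?
echo "worker exit status: $status"
cat "$log"
```

A service that ignores SIGTERM would instead run until TimeoutStopSec expires and then be killed with SIGKILL.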

Advanced Troubleshooting Techniques

  1. Debugging boot timing and environment issues
    systemd-analyze
    systemd-analyze critical-chain
    

    Check environment variables with:

    systemctl show myservice.service -p Environment
    
  2. Checking unit file syntax
    systemd-analyze verify /etc/systemd/system/myservice.service
    
  3. Finding dependency problems
    systemctl list-dependencies myservice.service
    systemctl list-dependencies --reverse myservice.service
    
  4. Inspecting service resource usage
    systemd-cgtop
    
  5. Testing service configurations
    systemd-run --unit=test-service /path/to/executable
    

Common Journal Issues

  1. Journal fills up disk space

    - Configure journal size limits in /etc/systemd/journald.conf

    - Manually clear old entries:

    journalctl --vacuum-time=1d
    journalctl --vacuum-size=500M
    
  2. Logs not persistent across reboots

    - Ensure persistent storage is enabled:

    mkdir -p /var/log/journal
    systemctl restart systemd-journald
    
  3. Too many or too few logs

    - Adjust the LogLevelMax in the service unit file and configure rate limiting in journald.conf
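For item 2, it helps to know that with the default Storage=auto setting journald writes persistently only if /var/log/journal exists. A minimal sketch that reports which location is in play; the function name journal_storage is hypothetical, and the optional root argument exists only so the check can be tried against a scratch directory:

```shell
#!/bin/sh
# journal_storage [ROOT]: report where journald would keep logs under Storage=auto.
journal_storage() {
    root=${1:-}
    if [ -d "$root/var/log/journal" ]; then
        echo "persistent ($root/var/log/journal)"
    elif [ -d "$root/run/log/journal" ]; then
        echo "volatile ($root/run/log/journal)"
    else
        echo "no journal directory found"
    fi
}

# Demonstration against a scratch root: volatile first, persistent after mkdir.
scratch=$(mktemp -d)
mkdir -p "$scratch/run/log/journal"
journal_storage "$scratch"
mkdir -p "$scratch/var/log/journal"
journal_storage "$scratch"
```

Called without an argument, the function inspects the real /var/log/journal and /run/log/journal paths.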

Quick Reference Summary

Essential Commands

Command                      Description
systemctl start [unit]       Start a unit
systemctl stop [unit]        Stop a unit
systemctl restart [unit]     Restart a unit
systemctl status [unit]      Check unit status
systemctl enable [unit]      Enable unit to start at boot
systemctl disable [unit]     Disable unit from starting at boot
systemctl daemon-reload      Reload unit files after changes
journalctl -u [unit]         View logs for a specific unit
journalctl -f                Follow logs in real-time
systemctl edit [unit]        Create/edit unit file overrides

Key Unit File Sections

Common Service Types

Service Dependency Types

Journal Storage Options

With this knowledge, you now have a comprehensive understanding of systemd service management in Linux, from basic service control to creating complex dependency chains and troubleshooting issues. This foundation will serve you well as you continue to work with and maintain Linux systems.