Container Security Hardening Guide with Seccomp and AppArmor

Introduction: The Growing Container Security Challenge

Container adoption has exploded across enterprises, with organizations running thousands of containerized workloads in production. However, this rapid adoption has created a critical security gap: most containers run with excessive privileges and unrestricted system access. A single compromised container can become a launching pad for lateral movement, privilege escalation, and data exfiltration. The 2024 Cloud Native Security Report revealed that 67% of organizations experienced container security incidents, with misconfigured security profiles being the leading cause. Without proper hardening through mandatory access controls and system call filtering, your containers remain vulnerable to kernel exploits, breakout attempts, and supply chain attacks.

Why Traditional Container Security Approaches Fall Short

Traditional container security has relied heavily on image scanning and network policies, but these measures address only surface-level vulnerabilities. Image scanning identifies known CVEs in packages but cannot prevent runtime exploitation of zero-day vulnerabilities or malicious behavior from legitimate binaries. Network policies control traffic flow but do nothing to restrict what processes can do once they're running inside a container.

The default Docker and Kubernetes configurations prioritize convenience over security. Linux exposes over 300 syscalls, while most applications need fewer than 50. Docker's default seccomp profile blocks only a few dozen of them, and Kubernetes applies no seccomp profile at all unless the pod or the kubelet is configured to use one. This excessive privilege creates an enormous attack surface. Without mandatory access control (MAC) systems like AppArmor or SELinux, containers can access any file the container runtime permits, read sensitive host information through /proc and /sys filesystems, and potentially exploit kernel vulnerabilities.

Furthermore, many security teams lack visibility into what their containers actually do at runtime. They deploy applications without understanding which system calls are necessary, which files need access, and what network capabilities are required. This knowledge gap makes it nearly impossible to implement least-privilege security models effectively.

The namespace and cgroup isolation that containers provide offers basic separation but wasn't designed as a security boundary. Kernel vulnerabilities affecting system calls can allow container escape, and without additional hardening layers, attackers who compromise a container gain a foothold to probe for these weaknesses.

Modern Solution: Implementing Defense-in-Depth with Seccomp and AppArmor

Container security hardening requires a defense-in-depth approach combining Seccomp (Secure Computing Mode) for system call filtering and AppArmor for mandatory access control. Let's implement a comprehensive hardening strategy with practical examples.

Creating a Custom Seccomp Profile

Seccomp filters system calls at the kernel level, blocking unnecessary syscalls before they execute. Here's a TypeScript-based tool to generate custom Seccomp profiles:

interface SeccompProfile {
  defaultAction: string;
  architectures: string[];
  syscalls: SyscallRule[];
}

interface SyscallRule {
  names: string[];
  action: string;
  args?: SyscallArg[];
}

interface SyscallArg {
  index: number;
  value: number;
  op: string;
}

class SeccompProfileGenerator {
  private profile: SeccompProfile;

  constructor() {
    this.profile = {
      defaultAction: "SCMP_ACT_ERRNO",
      architectures: ["SCMP_ARCH_X86_64", "SCMP_ARCH_X86", "SCMP_ARCH_X32"],
      syscalls: []
    };
  }

  allowEssentialSyscalls(): this {
    const essential = [
      "read", "write", "open", "openat", "close", "stat", "fstat",
      "lstat", "poll", "lseek", "mmap", "mprotect", "munmap", "brk",
      "rt_sigaction", "rt_sigprocmask", "rt_sigreturn", "ioctl",
      "pread64", "pwrite64", "readv", "writev", "access", "pipe",
      "select", "sched_yield", "mremap", "msync", "mincore", "madvise",
      "shmget", "shmat", "shmctl", "dup", "dup2", "pause", "nanosleep",
      "getitimer", "alarm", "setitimer", "getpid", "sendfile", "socket",
      "connect", "accept", "sendto", "recvfrom", "sendmsg", "recvmsg",
      "shutdown", "bind", "listen", "getsockname", "getpeername",
      "socketpair", "setsockopt", "getsockopt", "clone", "fork", "vfork",
      "execve", "exit", "wait4", "kill", "uname", "fcntl", "flock",
      "fsync", "fdatasync", "truncate", "ftruncate", "getdents",
      "getcwd", "chdir", "fchdir", "rename", "mkdir", "rmdir", "creat",
      "link", "unlink", "symlink", "readlink", "chmod", "fchmod",
      "chown", "fchown", "lchown", "umask", "gettimeofday", "getrlimit",
      "getrusage", "sysinfo", "times", "getuid",
      "getgid", "setuid", "setgid", "geteuid", "getegid", "setpgid",
      "getppid", "getpgrp", "setsid", "setreuid", "setregid",
      "getgroups", "setgroups", "setresuid", "getresuid", "setresgid",
      "getresgid", "getpgid", "setfsuid", "setfsgid", "getsid",
      "capget", "capset", "rt_sigpending", "rt_sigtimedwait",
      "rt_sigqueueinfo", "rt_sigsuspend", "sigaltstack", "utime",
      "mknod", "uselib", "personality", "ustat", "statfs", "fstatfs",
      "sysfs", "getpriority", "setpriority", "sched_setparam",
      "sched_getparam", "sched_setscheduler", "sched_getscheduler",
      "sched_get_priority_max", "sched_get_priority_min",
      "sched_rr_get_interval", "mlock", "munlock", "mlockall",
      // Privileged calls such as mount, reboot, and init_module are
      // deliberately left out of the allow list.
      "munlockall", "prctl", "arch_prctl", "adjtimex", "setrlimit",
      "sync", "gettid", "readahead",
      "setxattr", "lsetxattr", "fsetxattr", "getxattr", "lgetxattr",
      "fgetxattr", "listxattr", "llistxattr", "flistxattr",
      "removexattr", "lremovexattr", "fremovexattr", "tkill", "time",
      "futex", "sched_setaffinity", "sched_getaffinity", "io_setup",
      "io_destroy", "io_getevents", "io_submit", "io_cancel",
      "lookup_dcookie", "epoll_create", "getdents64", "set_tid_address",
      "restart_syscall", "semtimedop", "fadvise64", "timer_create",
      "timer_settime", "timer_gettime", "timer_getoverrun",
      "timer_delete", "clock_settime", "clock_gettime", "clock_getres",
      "clock_nanosleep", "exit_group", "epoll_wait", "epoll_ctl",
      "tgkill", "utimes", "mbind", "set_mempolicy", "get_mempolicy",
      "mq_open", "mq_unlink", "mq_timedsend", "mq_timedreceive",
      "mq_notify", "mq_getsetattr", "waitid", "splice", "tee",
      "sync_file_range", "vmsplice", "move_pages", "utimensat",
      "epoll_pwait", "signalfd", "timerfd_create", "eventfd",
      "fallocate", "timerfd_settime", "timerfd_gettime", "accept4",
      "signalfd4", "eventfd2", "epoll_create1", "dup3", "pipe2",
      "inotify_init1", "preadv", "pwritev", "rt_tgsigqueueinfo",
      "perf_event_open", "recvmmsg", "prlimit64"
    ];

    this.profile.syscalls.push({
      names: essential,
      action: "SCMP_ACT_ALLOW"
    });

    return this;
  }

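  blockDangerousSyscalls(): this {
    const dangerous = [
      "keyctl", "add_key", "request_key", "ptrace", "mbind",
      "migrate_pages", "move_pages", "set_mempolicy", "userfaultfd",
      "perf_event_open", "bpf", "mount", "umount2", "pivot_root",
      "reboot", "swapon", "swapoff", "init_module", "delete_module"
    ];

    // The default action only blocks syscalls that no rule allows, so
    // remove these names from every allow rule registered so far.
    // Call this after allowEssentialSyscalls().
    for (const rule of this.profile.syscalls) {
      if (rule.action === "SCMP_ACT_ALLOW") {
        rule.names = rule.names.filter(name => dangerous.indexOf(name) === -1);
      }
    }
    return this;
  }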

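  restrictNetworkSyscalls(): this {
    // An unconditional allow for socket() would override the filtered
    // rules below, so remove it from any rule without argument checks.
    for (const rule of this.profile.syscalls) {
      if (!rule.args) {
        rule.names = rule.names.filter(name => name !== "socket");
      }
    }

    // Allow socket() only for specific address families (argument 0).
    // Separate rules for the same syscall are OR-ed together.
    // Add 1 (AF_UNIX) if the application uses Unix domain sockets.
    const allowedFamilies = [2, 10]; // AF_INET, AF_INET6
    for (const family of allowedFamilies) {
      this.profile.syscalls.push({
        names: ["socket"],
        action: "SCMP_ACT_ALLOW",
        args: [{ index: 0, value: family, op: "SCMP_CMP_EQ" }]
      });
    }

    return this;
  }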

  generateProfile(): string {
    return JSON.stringify(this.profile, null, 2);
  }

  saveToFile(filename: string): void {
    const fs = require('fs');
    fs.writeFileSync(filename, this.generateProfile());
  }
}

// Usage example
const generator = new SeccompProfileGenerator();
generator
  .allowEssentialSyscalls()
  .blockDangerousSyscalls()
  .restrictNetworkSyscalls()
  .saveToFile('custom-seccomp.json');
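
Because an allow list this long is easy to get wrong, it can help to lint the generated profile before shipping it. The sketch below is one possible check, not part of any seccomp tooling: the `findRiskyAllows` function and the `HIGH_RISK` list are hypothetical names chosen for illustration, and the list should be adapted to your own threat model.

```typescript
// Minimal shape of the profile JSON used in this guide.
interface LintRule { names: string[]; action: string; args?: unknown[] }
interface LintProfile { defaultAction: string; syscalls: LintRule[] }

// Illustrative list only (an assumption, not an official deny list).
const HIGH_RISK: string[] = [
  "ptrace", "mount", "umount2", "reboot", "init_module",
  "delete_module", "bpf", "keyctl", "perf_event_open"
];

// Returns the high-risk syscalls that some rule allows without argument checks.
function findRiskyAllows(profile: LintProfile, risky: string[] = HIGH_RISK): string[] {
  const found: string[] = [];
  for (const rule of profile.syscalls) {
    const unconditional = !rule.args || rule.args.length === 0;
    if (rule.action !== "SCMP_ACT_ALLOW" || !unconditional) continue;
    for (const name of rule.names) {
      if (risky.indexOf(name) !== -1 && found.indexOf(name) === -1) {
        found.push(name);
      }
    }
  }
  return found.sort();
}

// Usage: a deliberately flawed sample profile.
const sampleProfile: LintProfile = {
  defaultAction: "SCMP_ACT_ERRNO",
  syscalls: [
    { names: ["read", "write", "mount", "ptrace"], action: "SCMP_ACT_ALLOW" }
  ]
};
const riskyAllows = findRiskyAllows(sampleProfile);
console.log(riskyAllows); // lists "mount" and "ptrace"
```

Running a check like this in CI makes it harder for a privileged syscall to slip into the allow list unnoticed.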

Implementing AppArmor Profiles

AppArmor provides path-based mandatory access control. Here's a TypeScript utility to manage AppArmor profiles:

interface AppArmorRule {
  path: string;
  permissions: string;
  condition?: string;
}

class AppArmorProfileBuilder {
  private profileName: string;
  private rules: AppArmorRule[] = [];
  private capabilities: string[] = [];
  private networkRules: string[] = [];

  constructor(profileName: string) {
    this.profileName = profileName;
  }

  allowFileAccess(path: string, permissions: string): this {
    this.rules.push({ path, permissions });
    return this;
  }

  allowCapability(capability: string): this {
    this.capabilities.push(capability);
    return this;
  }

  allowNetwork(family: string, type: string): this {
    this.networkRules.push(`network ${family} ${type}`);
    return this;
  }

  denyFileAccess(path: string): this {
    this.rules.push({ path, permissions: "" });
    return this;
  }

  generateProfile(): string {
    let profile = `#include <tunables/global>\n\n`;
    profile += `profile ${this.profileName} flags=(attach_disconnected,mediate_deleted) {\n`;
    profile += `  #include <abstractions/base>\n\n`;

    // Add capabilities
    if (this.capabilities.length > 0) {
      profile += `  # Capabilities\n`;
      this.capabilities.forEach(cap => {
        profile += `  capability ${cap},\n`;
      });
      profile += `\n`;
    }

    // Add network rules
    if (this.networkRules.length > 0) {
      profile += `  # Network access\n`;
      this.networkRules.forEach(rule => {
        profile += `  ${rule},\n`;
      });
      profile += `\n`;
    }

    // Add file rules
    profile += `  # File access\n`;
    this.rules.forEach(rule => {
      if (rule.permissions) {
        profile += `  ${rule.path} ${rule.permissions},\n`;
      } else {
        profile += `  deny ${rule.path} rwx,\n`;
      }
    });

    // Deny sensitive paths
    profile += `\n  # Deny sensitive paths\n`;
    profile += `  deny /proc/sys/kernel/** rwx,\n`;
    profile += `  deny /sys/kernel/security/** rwx,\n`;
    profile += `  deny /proc/kcore rwx,\n`;

    profile += `}\n`;
    return profile;
  }
}

// Example: Web application profile
const webAppProfile = new AppArmorProfileBuilder('docker-webapp');
webAppProfile
  .allowCapability('net_bind_service')
  .allowCapability('setuid')
  .allowCapability('setgid')
  .allowNetwork('inet', 'tcp')
  .allowNetwork('inet', 'udp')
  .allowFileAccess('/app/**', 'r')
  .allowFileAccess('/tmp/**', 'rw')
  .allowFileAccess('/var/log/app/**', 'w')
  .denyFileAccess('/etc/shadow')
  .denyFileAccess('/etc/sudoers');

console.log(webAppProfile.generateProfile());
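
The builder above accepts any permission string, so a typo only surfaces when the profile is loaded. As a rough sketch, a small validator can catch the most common mistakes earlier. `checkFilePermissions` is a hypothetical helper, and it covers only a subset of the modes AppArmor accepts (the simple modes r, w, a, l, k, m and a few execute transitions).

```typescript
// Simple file access modes: read, write, append, link, lock, mmap-exec.
const SIMPLE_MODES = "rwalkm";
// A few common execute transitions; AppArmor accepts more than these.
const EXEC_MODES: string[] = ["ix", "px", "Px", "cx", "Cx", "ux", "Ux"];

// Returns a list of problems found in a permission string such as "rw" or "rix".
function checkFilePermissions(perms: string): string[] {
  const problems: string[] = [];
  if (perms.length === 0) {
    problems.push("permission string is empty");
  }
  for (let i = 0; i < perms.length; i++) {
    const ch = perms.charAt(i);
    const pair = perms.substring(i, i + 2);
    if (SIMPLE_MODES.indexOf(ch) !== -1) {
      continue;
    }
    if (EXEC_MODES.indexOf(pair) !== -1) {
      i += 1; // consume the second character of the transition
      continue;
    }
    if (ch === "x") {
      problems.push("bare 'x' needs a transition qualifier such as 'ix' in allow rules");
    } else {
      problems.push("unknown mode '" + ch + "'");
    }
  }
  // AppArmor treats write and append as mutually exclusive in one rule.
  if (perms.indexOf("w") !== -1 && perms.indexOf("a") !== -1) {
    problems.push("'w' and 'a' cannot be combined");
  }
  return problems;
}

// Usage
console.log(checkFilePermissions("rw"));  // no problems reported
console.log(checkFilePermissions("rwx")); // reports the bare 'x'
```

A check like this could be called from `allowFileAccess` before the rule is stored. The final authority is still the AppArmor parser itself when the profile is loaded on the node.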

Kubernetes Integration

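Deploy hardened containers in Kubernetes with security contexts. The seccomp localhostProfile is a path relative to the kubelet's seccomp directory on the node, and the AppArmor profile must already be loaded on every node that can schedule the pod. The example uses the beta AppArmor annotation; Kubernetes 1.30 and later also accept an appArmorProfile field in the security context.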

interface KubernetesSecurityContext {
  seccompProfile: {
    type: string;
    localhostProfile?: string;
  };
  appArmorProfile?: string;
  runAsNonRoot: boolean;
  runAsUser: number;
  capabilities: {
    drop: string[];
    add: string[];
  };
  readOnlyRootFilesystem: boolean;
}

class SecureDeploymentGenerator {
  generateSecurityContext(
    seccompProfile: string,
    appArmorProfile: string
  ): KubernetesSecurityContext {
    return {
      seccompProfile: {
        type: "Localhost",
        localhostProfile: seccompProfile
      },
      appArmorProfile: `localhost/${appArmorProfile}`,
      runAsNonRoot: true,
      runAsUser: 1000,
      capabilities: {
        drop: ["ALL"],
        add: ["NET_BIND_SERVICE"]
      },
      readOnlyRootFilesystem: true
    };
  }

  generatePodSpec(securityContext: KubernetesSecurityContext): object {
    return {
      apiVersion: "v1",
      kind: "Pod",
      metadata: {
        name: "hardened-app",
        annotations: {
          "container.apparmor.security.beta.kubernetes.io/app": 
            securityContext.appArmorProfile
        }
      },
      spec: {
        securityContext: {
          runAsNonRoot: securityContext.runAsNonRoot,
          runAsUser: securityContext.runAsUser,
          fsGroup: 1000,
          seccompProfile: securityContext.seccompProfile
        },
        containers: [{
          name: "app",
          image: "myapp:latest",
          securityContext: {
            allowPrivilegeEscalation: false,
            capabilities: securityContext.capabilities,
            readOnlyRootFilesystem: securityContext.readOnlyRootFilesystem
          },
          volumeMounts: [{
            name: "tmp",
            mountPath: "/tmp"
          }]
        }],
        volumes: [{
          name: "tmp",
          emptyDir: {}
        }]
      }
    };
  }
}
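
On newer clusters the annotation can be replaced by a field. The following is a minimal sketch, assuming Kubernetes 1.30 or later, where `appArmorProfile` is available in the container security context. `buildContainerSecurityContext` is a hypothetical helper name; note that in the field form the profile is referenced by its bare name rather than with a `localhost/` prefix.

```typescript
// Field-based AppArmor reference (assumes Kubernetes 1.30+).
interface AppArmorProfileRef {
  type: "RuntimeDefault" | "Localhost" | "Unconfined";
  localhostProfile?: string; // only used when type is "Localhost"
}

interface ContainerSecurityContext {
  allowPrivilegeEscalation: boolean;
  readOnlyRootFilesystem: boolean;
  capabilities: { drop: string[]; add: string[] };
  appArmorProfile: AppArmorProfileRef;
}

// Builds a container-level security context that references a profile
// already loaded on the node under the given name.
function buildContainerSecurityContext(profileName: string): ContainerSecurityContext {
  return {
    allowPrivilegeEscalation: false,
    readOnlyRootFilesystem: true,
    capabilities: { drop: ["ALL"], add: ["NET_BIND_SERVICE"] },
    appArmorProfile: { type: "Localhost", localhostProfile: profileName }
  };
}

// Usage: serialize for inclusion in a pod manifest.
const containerContext = buildContainerSecurityContext("docker-webapp");
console.log(JSON.stringify(containerContext, null, 2));
```

If you need to support both older and newer clusters during a migration, keeping the annotation alongside the field is a common transitional approach, but check the behavior of your specific Kubernetes version.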

Common Pitfalls to Avoid

Overly Permissive Profiles: Starting with a restrictive profile and gradually adding permissions is safer than beginning with broad access. Many teams create profiles that allow too much, negating security benefits.

Ignoring Application Dependencies: Applications often have hidden dependencies on specific syscalls or file paths. Test thoroughly in staging environments before production deployment to identify all requirements.

Not Monitoring Audit Logs: AppArmor logs each denial, and Seccomp logs when a rule uses a logging or kill action (a syscall rejected with a plain SCMP_ACT_ERRNO is usually silent). Failing to monitor these logs means missing both legitimate application needs and actual attack attempts.

Hardcoding Profiles: Environment-specific paths and requirements vary. Use templating and configuration management to generate profiles dynamically rather than maintaining static files.

Forgetting Init Processes: Container init processes (PID 1) often need different permissions than application processes. Consider using specialized init systems like tini or dumb-init with appropriate profiles.

Neglecting Updates: As applications evolve, their security requirements change. Implement a process to review and update security profiles with each application release.
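
To make the audit log pitfall concrete, here is a small sketch that pulls the useful fields out of an AppArmor denial line. The sample line is hand-written for illustration and follows the usual key="value" layout of kernel audit messages; `parseDenial` is a hypothetical helper, and real log lines may carry additional fields.

```typescript
interface Denial {
  profile: string;
  operation: string;
  name: string;       // path or resource that was denied
  deniedMask: string; // permissions that were refused
}

// Extracts quoted key="value" pairs from a line that records an AppArmor denial.
// Returns null for lines that are not denials.
function parseDenial(line: string): Denial | null {
  if (line.indexOf('apparmor="DENIED"') === -1) {
    return null;
  }
  const fields: { [key: string]: string } = {};
  const pattern = /(\w+)="([^"]*)"/g;
  let match: RegExpExecArray | null = pattern.exec(line);
  while (match !== null) {
    fields[match[1]] = match[2];
    match = pattern.exec(line);
  }
  return {
    profile: fields["profile"] || "",
    operation: fields["operation"] || "",
    name: fields["name"] || "",
    deniedMask: fields["denied_mask"] || ""
  };
}

// Usage with an illustrative (not captured) log line.
const sampleLine =
  'audit: type=1400 audit(1700000000.123:42): apparmor="DENIED" ' +
  'operation="open" profile="docker-webapp" name="/etc/shadow" pid=1234 ' +
  'comm="cat" requested_mask="r" denied_mask="r" fsuid=1000 ouid=0';
const denial = parseDenial(sampleLine);
console.log(denial);
```

Grouping parsed denials by profile and path gives a quick view of which rules an application actually needs and which denials look like probing.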

Best Practices for Container Security Hardening

Start with Audit Mode: Deploy AppArmor profiles in complain mode, and Seccomp profiles with a logging default action, to record violations without blocking them. Analyze logs to understand actual requirements before enforcing.

Use Profile Generators: Tools like bane, oci-seccomp-bpf-hook, and aa-genprof can automatically generate profiles based on observed behavior, providing a solid starting point.

Implement Least Privilege: Drop all capabilities by default and add only those explicitly required. Most applications need zero Linux capabilities.

Layer Security Controls: Combine Seccomp, AppArmor, network policies, and Pod Security Standards for defense-in-depth. No single control is sufficient.

Automate Profile Management: Integrate profile generation and deployment into CI/CD pipelines. Security should be code-reviewed and version-controlled like application code.

Test Failure Scenarios: Verify that security profiles actually block malicious behavior by testing with exploit frameworks and penetration testing tools.

Document Profile Decisions: Maintain clear documentation explaining why each permission is granted. This aids future reviews and helps new team members understand security posture.

Regular Security Audits: Schedule quarterly reviews of security profiles to remove unnecessary permissions and add protections for newly discovered attack vectors.
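
The audit-first practice can be applied to Seccomp as well. The sketch below derives a log-only copy of an enforcing profile by switching the default action to SCMP_ACT_LOG, which lets unlisted syscalls through while recording them. It assumes a kernel and runtime recent enough to support that action; `toLogOnlyProfile` is a hypothetical helper name.

```typescript
interface AuditRule { names: string[]; action: string }
interface AuditProfile {
  defaultAction: string;
  architectures: string[];
  syscalls: AuditRule[];
}

// Returns a copy of the profile that logs unlisted syscalls instead of
// rejecting them. The input profile is left untouched.
function toLogOnlyProfile(enforcing: AuditProfile): AuditProfile {
  return {
    defaultAction: "SCMP_ACT_LOG",
    architectures: enforcing.architectures.slice(),
    syscalls: enforcing.syscalls.map(rule => ({
      names: rule.names.slice(),
      action: rule.action
    }))
  };
}

// Usage
const enforcingProfile: AuditProfile = {
  defaultAction: "SCMP_ACT_ERRNO",
  architectures: ["SCMP_ARCH_X86_64"],
  syscalls: [{ names: ["read", "write", "exit_group"], action: "SCMP_ACT_ALLOW" }]
};
const auditProfile = toLogOnlyProfile(enforcingProfile);
console.log(auditProfile.defaultAction); // SCMP_ACT_LOG
```

Run the workload under the log-only profile in staging, add the syscalls that show up in the logs to the allow list, then switch back to the enforcing default action.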

Frequently Asked Questions

Q: Will Seccomp and AppArmor impact application performance? A: The performance impact is minimal, typically under 1-2% overhead. Seccomp filtering happens at the kernel level with negligible latency, and AppArmor's path-based checks are highly optimized. The security benefits far outweigh the minor performance cost.

Q: Can I use Seccomp and AppArmor together? A: Yes, and you should. They provide complementary protections: Seccomp filters system calls while AppArmor controls file and network access. Using both creates stronger defense-in-depth.

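Q: How do I troubleshoot blocked syscalls or file access? A: AppArmor denials appear in /var/log/audit/audit.log when auditd is running, and otherwise in the kernel log (dmesg or journalctl -k). Seccomp events go to the same places, but only for actions that log, such as SCMP_ACT_LOG or a kill action. A syscall rejected with SCMP_ACT_ERRNO usually leaves no log entry, so temporarily switch the default action to SCMP_ACT_LOG while debugging. These logs show exactly what was blocked, helping you adjust profiles appropriately.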

Q: Are these security measures compatible with all container runtimes? A: Seccomp is supported by Docker, containerd, CRI-O, and other OCI-compliant runtimes. AppArmor requires kernel support and is available on Ubuntu, Debian, and SUSE-based systems. SELinux provides similar functionality on RHEL-based distributions.

Q: How do I handle third-party containers without custom profiles? A: Start with Docker's default Seccomp profile, which blocks dangerous syscalls while allowing most common operations. For AppArmor, use the docker-default profile as a starting point.
