RDT Software Package

Overview

Intel(R) Resource Director Technology (Intel(R) RDT) includes:

  • Cache Monitoring Technology (CMT)
  • Memory Bandwidth Monitoring (MBM)
  • Cache Allocation Technology (CAT)
  • Code and Data Prioritization (CDP)
  • Memory Bandwidth Allocation (MBA)
  • CQM (Cache QoS Monitoring), reported as the CPU flags "cqm_llc" and "cqm_occup_llc"

Hardware uses a CLOSID (Class of Service ID) and an RMID (Resource Monitoring ID) to identify a control group and a monitoring group respectively. Each resource group is mapped to these IDs based on the kind of group. The numbers of CLOSIDs and RMIDs are limited by the hardware, so creating a "CTRL_MON" directory may fail if either CLOSIDs or RMIDs run out, and creating a "MON" group may fail if RMIDs run out.
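
To make the two kinds of group concrete, here is a minimal sketch of creating them with mkdir. The helper names make_ctrl_mon_group and make_mon_group are made up for this example, and the resctrl root is passed as an argument so the helpers can also be tried against a scratch directory. On a real resctrl mount the kernel creates the mon_groups subdirectory itself, and mkdir is expected to fail (typically with "No space left on device") once the hardware IDs are used up.

```shell
# Hypothetical helpers, not part of intel-cmt-cat or the kernel.

# Create a CTRL_MON group: needs one free CLOSID and one free RMID.
# $1: resctrl mount point, $2: group name
make_ctrl_mon_group() {
    mkdir "$1/$2" || {
        echo "cannot create CTRL_MON group '$2' (out of CLOSIDs or RMIDs?)" >&2
        return 1
    }
}

# Create a MON group inside an existing CTRL_MON group: needs one free RMID.
# $1: resctrl mount point, $2: parent CTRL_MON group, $3: MON group name
make_mon_group() {
    mkdir "$1/$2/mon_groups/$3" || {
        echo "cannot create MON group '$3' (out of RMIDs?)" >&2
        return 1
    }
}
```

As root, with resctrl mounted, a call such as make_ctrl_mon_group /sys/fs/resctrl grp1 followed by make_mon_group /sys/fs/resctrl grp1 m1 would create one group of each kind.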

Hardware Support

Table 1. Intel(R) RDT hardware support

Processor                                    CMT  MBM  L3 CAT   L3 CDP  L2 CAT   MBA
Intel(R) Xeon(R) processor E5 v3             Yes  No   Yes (1)  No      No       No
Intel(R) Xeon(R) processor D                 Yes  Yes  Yes (2)  No      No       No
Intel(R) Xeon(R) processor E3 v4             No   No   Yes (3)  No      No       No
Intel(R) Xeon(R) processor E5 v4             Yes  Yes  Yes (2)  Yes     No       No
Intel(R) Xeon(R) Scalable Processors (6)     Yes  Yes  Yes (2)  Yes     No       Yes (5)
  (Intel® Xeon® Gold 6152 Processor)
Intel(R) Atom(R) processor for Server C3000  No   No   No       No      Yes (4)  No

  (2) Sixteen L3 CAT classes of service (CLOS). There are no pre-defined classes of service and they can be changed at run time. L3 CAT CLOS to hardware thread association can be changed at run time.
  (5) Eight MBA classes of service (CLOS). There are no pre-defined classes of service and they can be changed at run time. MBA CLOS to hardware thread association can be changed at run time.

OS Support

  1. OS Frameworks
On modern Linux kernels, it is advised to use the kernel/OS interface when available.
  2. Interfaces

The intel-cmt-cat software library and utilities offer two interfaces for programming Intel(R) RDT technologies: the MSR interface and the OS interface.

The MSR interface configures the platform by programming the hardware (MSRs) directly. This is the legacy interface. It requires no kernel support for Intel(R) RDT, but it is limited to monitoring and managing resources on a per-core basis.

The OS interface (Resctrl) was added to the package later. When it is selected, the library uses Linux kernel extensions to program these technologies.

Table 3. OS interface feature support

intel-cmt-cat  RDT feature                       Kernel version  Recommended interface
version        enabled                           required
0.1.4          CMT (Perf)                        4.1             MSR (1)
1.0.0          MBM (Perf)                        4.7             MSR (1)
1.1.0          L3 CAT, L3 CDP, L2 CAT (Resctrl)  4.10            OS for allocation only (with the exception of MBA);
                                                                 MSR for allocation + monitoring (2)
1.2.0          MBA (Resctrl)                     4.12            OS for allocation only;
                                                                 MSR for allocation + monitoring (2)
2.0.0          CMT, MBM (Resctrl)                4.14            OS
2.0.0          L2 CDP                            4.16            OS
  3. Software dependencies

The only dependencies of intel-cmt-cat are the C and pthreads libraries and:

  • without kernel extensions - 'msr' kernel module
  • with kernel extensions - Intel(R) RDT extended Perf system call and Resctrl filesystem

Enable Intel(R) RDT support in:

  • kernel v4.10 - v4.13 with kernel configuration option CONFIG_INTEL_RDT_A
  • kernel v4.14+ with kernel configuration option CONFIG_INTEL_RDT
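
One way to check which of these options a kernel was built with is to search its config file. The sketch below is only an illustration: the helper name rdt_kconfig_enabled is made up, and the default path assumes the distribution installs the running kernel's config under /boot, as Ubuntu does.

```shell
# Hypothetical helper: succeed if a kernel config file enables Intel RDT.
# $1: path to a kernel config file (assumed default: /boot/config-<release>)
rdt_kconfig_enabled() {
    cfg="${1:-/boot/config-$(uname -r)}"
    # Matches CONFIG_INTEL_RDT_A=y (v4.10 - v4.13) or CONFIG_INTEL_RDT=y (v4.14+).
    grep -Eq '^CONFIG_INTEL_RDT(_A)?=y$' "$cfg"
}
```

For example, rdt_kconfig_enabled && echo "RDT enabled" prints the message only when one of the two options is set to y.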

Ubuntu version

  • 16.04: 4.4.0-121-generic
    • intel-cmt-cat, Version: 0.1.4-1
    • There is no CONFIG_INTEL_RDT setting in kernel config
  • 17.10: 4.13.0-39-generic
    • intel-cmt-cat, Version: 1.1.0-1
    • CONFIG_INTEL_RDT=y
  • 18.04: 4.15.0-20-generic
    • intel-cmt-cat, Version: 1.2.0-1
    • CONFIG_INTEL_RDT=y

Kernel version

...                                     EOL
mainline  4.17-rc4       2018-05-07
stable    4.16.7         2018-05-01
stable    4.15.18 [EOL]  2018-04-19
longterm  4.14.39        2018-05-01     Jan, 2020
longterm  4.4.131        2018-05-02     Feb, 2022

Instructions

Load the 'msr' kernel module (needed when running without the kernel extensions):

# modprobe msr

To use the OS (Resctrl) interface, mount the resctrl file system:

# mount -t resctrl resctrl [-o cdp[,cdpl2]] /sys/fs/resctrl

The mount options are:

  • "cdp": Enable code/data prioritization in L3 cache allocations.
  • "cdpl2": Enable code/data prioritization in L2 cache allocations.

Neither option is supported on Ubuntu 17.10.
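
The optional part of that mount command can be assembled mechanically. The sketch below only prints the command, so it can be inspected before being run as root. The helper name resctrl_mount_cmd is made up for illustration, and the mount point is fixed at /sys/fs/resctrl.

```shell
# Hypothetical helper: print the resctrl mount command for the given options.
# Arguments: zero or more of "cdp" and "cdpl2"; anything else is rejected.
resctrl_mount_cmd() {
    opts=""
    for o in "$@"; do
        case "$o" in
            cdp|cdpl2) opts="${opts:+$opts,}$o" ;;
            *) echo "unknown resctrl mount option: $o" >&2; return 1 ;;
        esac
    done
    # Emit "-o ..." only when at least one option was requested.
    echo "mount -t resctrl resctrl${opts:+ -o $opts} /sys/fs/resctrl"
}
```

Calling resctrl_mount_cmd cdp prints the command with "-o cdp", and calling it with no arguments prints the plain mount command.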

OpenStack

$ openstack flavor set m1.large \
  --property hw:cpu_policy=dedicated \
  --property hw:cpu_thread_policy=isolate

Info directory

The 'info' directory contains information about the enabled resources. Each resource has its own subdirectory. The subdirectory names reflect the resource names.
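
A quick way to see what the directory holds is to walk it and print every file. The sketch below assumes the layout described above (one subdirectory per resource, each holding small text files). The helper name show_rdt_info is made up, and file names such as num_closids and cbm_mask come from the kernel's resctrl documentation rather than from this post.

```shell
# Hypothetical helper: print "resource/file: value" for each file under
# <root>/info. $1: resctrl mount point (default /sys/fs/resctrl)
show_rdt_info() {
    root="${1:-/sys/fs/resctrl}"
    for f in "$root"/info/*/*; do
        # Skip the unexpanded glob when nothing matches.
        [ -f "$f" ] || continue
        res="$(basename "$(dirname "$f")")"
        printf '%s/%s: %s\n' "$res" "$(basename "$f")" "$(cat "$f")"
    done
}
```

On a machine with L3 CAT enabled, the output would be expected to include lines such as L3/num_closids and L3/cbm_mask with that platform's values.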

Kernel source

  • intel_rdt.c

    • get_rdt_resources(void) calls rdt_quirks(void):

      switch (boot_cpu_data.x86_model) {
      case INTEL_FAM6_HASWELL_X:
          if (!rdt_options[RDT_FLAG_L3_CAT].force_off)
              cache_alloc_hsw_probe();
          break;
      case INTEL_FAM6_SKYLAKE_X:
          if (boot_cpu_data.x86_stepping <= 4)
              set_rdt_options("!cmt,!mbmtotal,!mbmlocal,!l3cat");
      }
      
    • Notes on cache_alloc_hsw_probe(void)

      /*
      * cache_alloc_hsw_probe() - Have to probe for Intel haswell server CPUs
      * as they do not have CPUID enumeration support for Cache allocation.
      * The check for Vendor/Family/Model is not enough to guarantee that
      * the MSRs won't #GP fault because only the following SKUs support
      * CAT:
      *  Intel(R) Xeon(R)  CPU E5-2658  v3  @  2.20GHz
      *  Intel(R) Xeon(R)  CPU E5-2648L v3  @  1.80GHz
      *  Intel(R) Xeon(R)  CPU E5-2628L v3  @  2.00GHz
      *  Intel(R) Xeon(R)  CPU E5-2618L v3  @  2.30GHz
      *  Intel(R) Xeon(R)  CPU E5-2608L v3  @  2.00GHz
      *  Intel(R) Xeon(R)  CPU E5-2658A v3  @  2.20GHz
      *
      * Probe by trying to write the first of the L3 cache mask registers
      * and checking that the bits stick. Max CLOSids is always 4 and max cbm length
      * is always 20 on hsw server parts. The minimum cache bitmask length
      * allowed for HSW server is always 2 bits. Hardcode all of them.
      */
      static inline void cache_alloc_hsw_probe(void)
      {
        struct rdt_resource *r  = &rdt_resources_all[RDT_RESOURCE_L3];
        u32 l, h, max_cbm = BIT_MASK(20) - 1;
      
        if (wrmsr_safe(IA32_L3_CBM_BASE, max_cbm, 0))
          return;
        rdmsr(IA32_L3_CBM_BASE, l, h);
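
The arithmetic behind those hardcoded limits can be checked outside the kernel. The sketch below is a user-space illustration only: it touches no MSRs, and the names hsw_max_cbm and hsw_cbm_valid are made up. The 20-bit length and the 2-bit minimum come from the comment above. The requirement that the set bits be contiguous is an assumption taken from Intel's general CAT rules, not from this excerpt.

```shell
# Constants taken from the kernel comment: 20-bit CBM, minimum 2 bits.
HSW_CBM_LEN=20
HSW_MIN_CBM_BITS=2

# Print the full-width mask the probe writes: BIT_MASK(20) - 1.
hsw_max_cbm() {
    printf '0x%x\n' $(( (1 << HSW_CBM_LEN) - 1 ))
}

# Succeed if $1 is a mask the hardcoded limits would accept: within 20 bits,
# one contiguous run of 1s (assumed CAT rule), and at least 2 bits set.
hsw_cbm_valid() {
    m=$(( $1 ))
    max=$(( (1 << HSW_CBM_LEN) - 1 ))
    [ "$m" -gt 0 ] && [ "$m" -le "$max" ] || return 1
    # Strip trailing zeros; a contiguous run then has the form 2^k - 1.
    while [ $(( m & 1 )) -eq 0 ]; do m=$(( m >> 1 )); done
    [ $(( m & (m + 1) )) -eq 0 ] || return 1
    [ "$m" -ge $(( (1 << HSW_MIN_CBM_BITS) - 1 )) ]
}
```

For instance, hsw_cbm_valid 0xc0 succeeds (two adjacent bits), while hsw_cbm_valid 0x5 fails because the set bits are not adjacent.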
      