HWLOC_DEBUG=1 breaks hwloc on MacOS. #564

@FunMiles

Description

I discovered this issue after creating a macOS CMakeLists.txt based on the Windows one. The Windows CMakeLists.txt turns on HWLOC_DEBUG when the code is compiled in debug mode. The issue does not seem to be specific to the CMake build: it also occurs with the autotools build.
Notes:

  • I am not sure what the proper way is to turn on the flag with configure. I forced the flag by overriding the compiler (see below).
  • Other test programs also crash, but the sample code below triggers the crash faster.
  • Setting HWLOC_DEBUG_VERBOSE=1 when running code compiled without HWLOC_DEBUG does not trigger the issue. It seems HWLOC_DEBUG has to be defined at compile time for the crash to occur.
  • Since the issue seems to occur only with HWLOC_DEBUG turned on, it is not a show-stopper, but IMO that flag should either not make things crash or be removed from the code.

What version of hwloc are you using?

3.0.0a1-git
commit 7987eb4

Which operating system and hardware are you running on?

Darwin Michels-MacBook-Pro.local 22.3.0 Darwin Kernel Version 22.3.0: Thu Jan 5 20:53:49 PST 2023; root:xnu-8792.81.2~2/RELEASE_X86_64 x86_64

Details of the problem

Steps to reproduce:

  1. Configure with HWLOC_DEBUG turned on: CC="gcc -DHWLOC_DEBUG" <path to hwloc>/configure --prefix=$(pwd)/dbg
  2. Compile and install: make -j 16 && make install
  3. Create the following test file, test.cpp:
#include <cstdlib>
#include <iostream>
#include <thread>

#include <hwloc.h>

inline
int numCores()
{
        hwloc_topology_t topology;
        hwloc_cpuset_t cpuset;
        hwloc_obj_t obj;

        /* Allocate and initialize topology object. */
        hwloc_topology_init(&topology);

        /* ... Optionally, put detection configuration here to ignore
           some objects types, define a synthetic topology, etc....

           The default is to detect all the objects of the machine that
           the caller is allowed to access.  See Configure Topology
           Detection. */
        hwloc_topology_set_all_types_filter(topology, HWLOC_TYPE_FILTER_KEEP_NONE);
        hwloc_topology_set_type_filter(topology, HWLOC_OBJ_CORE, HWLOC_TYPE_FILTER_KEEP_ALL);
        /* Perform the topology detection. */
        hwloc_topology_load(topology);

        /* Optionally, get some additional topology information
           in case we need the topology depth later. */
        auto topodepth = hwloc_topology_get_depth(topology);
        // Try to get the number of CPU cores from topology
        int depth = hwloc_get_type_depth(topology, HWLOC_OBJ_CORE);
        int nCores = -1;
        if (depth == HWLOC_TYPE_DEPTH_UNKNOWN)
                nCores = -static_cast<int>(std::thread::hardware_concurrency());
        else
                nCores = hwloc_get_nbobjs_by_depth(topology, depth);

        // Destroy topology object and return
        hwloc_topology_destroy(topology);
        return nCores;
}

inline
int default_num_threads() { return std::abs(numCores()); }

int main() {
    std::cout << "Number of threads: " << default_num_threads() << std::endl;
}
  4. Compile the test code: clang++ -std=c++20 test.cpp -Ldbg/lib -lhwloc -Idbg/include -framework CoreFoundation -framework IOKit
  5. Run the code: ./a.out crashes.

Additional information

sysctl hw

hw.ncpu: 16
hw.byteorder: 1234
hw.memsize: 68719476736
hw.activecpu: 16
hw.perflevel0.physicalcpu: 8
hw.perflevel0.physicalcpu_max: 8
hw.perflevel0.logicalcpu: 16
hw.perflevel0.logicalcpu_max: 16
hw.perflevel0.l1icachesize: 32768
hw.perflevel0.l1dcachesize: 32768
hw.perflevel0.l2cachesize: 262144
hw.perflevel0.cpusperl2: 2
hw.perflevel0.l3cachesize: 16777216
hw.perflevel0.cpusperl3: 16
hw.perflevel0.name: Standard
hw.features.allows_security_research: 0
hw.optional.floatingpoint: 1
hw.optional.mmx: 1
hw.optional.sse: 1
hw.optional.sse2: 1
hw.optional.sse3: 1
hw.optional.supplementalsse3: 1
hw.optional.sse4_1: 1
hw.optional.sse4_2: 1
hw.optional.x86_64: 1
hw.optional.aes: 1
hw.optional.avx1_0: 1
hw.optional.rdrand: 1
hw.optional.f16c: 1
hw.optional.enfstrg: 1
hw.optional.fma: 1
hw.optional.avx2_0: 1
hw.optional.bmi1: 1
hw.optional.bmi2: 1
hw.optional.rtm: 0
hw.optional.hle: 0
hw.optional.adx: 1
hw.optional.mpx: 0
hw.optional.sgx: 0
hw.optional.avx512f: 0
hw.optional.avx512cd: 0
hw.optional.avx512dq: 0
hw.optional.avx512bw: 0
hw.optional.avx512vl: 0
hw.optional.avx512ifma: 0
hw.optional.avx512vbmi: 0
hw.physicalcpu: 8
hw.physicalcpu_max: 8
hw.logicalcpu: 16
hw.logicalcpu_max: 16
hw.cputype: 7
hw.cpusubtype: 8
hw.cpu64bit_capable: 1
hw.cpufamily: 260141638
hw.cpusubfamily: 0
hw.cacheconfig: 16 2 2 16 0 0 0 0 0 0
hw.cachesize: 68719476736 32768 262144 16777216 0 0 0 0 0 0
hw.pagesize: 4096
hw.pagesize32: 4096
hw.busfrequency: 400000000
hw.busfrequency_min: 400000000
hw.busfrequency_max: 400000000
hw.cpufrequency: 2400000000
hw.cpufrequency_min: 2400000000
hw.cpufrequency_max: 2400000000
hw.cachelinesize: 64
hw.l1icachesize: 32768
hw.l1dcachesize: 32768
hw.l2cachesize: 262144
hw.l3cachesize: 16777216
hw.tbfrequency: 1000000000
hw.packages: 1
hw.use_kernelmanagerd: 1
hw.serialdebugmode: 0
hw.nperflevels: 1
hw.targettype: Mac
hw.cputhreadtype: 1

sysctl machdep

machdep.vectors.timer: 221
machdep.vectors.IPI: 222
machdep.pmap.hashwalks: 371183049
machdep.pmap.hashcnts: 379940942
machdep.pmap.hashmax: 16
machdep.pmap.kernel_text_ps: 4096
machdep.pmap.kern_pv_reserve: 16000
machdep.memmap.Conventional: 68608507904
machdep.memmap.RuntimeServices: 1511424
machdep.memmap.ACPIReclaim: 393216
machdep.memmap.ACPINVS: 790528
machdep.memmap.PalCode: 0
machdep.memmap.Reserved: 91496448
machdep.memmap.Unusable: 0
machdep.memmap.Other: 0
machdep.tsc.nanotime.tsc_base: 55802224609698
machdep.tsc.nanotime.ns_base: 369037310150163
machdep.tsc.nanotime.scale: 1789569706
machdep.tsc.nanotime.shift: 0
machdep.tsc.nanotime.generation: 31
machdep.tsc.frequency: 2400000000
machdep.tsc.deep_idle_rebase: 1
machdep.tsc.at_boot: 44521694
machdep.tsc.rebase_abs_time: 11086057180
machdep.misc.fast_uexc_support: 1
machdep.misc.panic_restart_timeout: 2147483647
machdep.misc.interrupt_latency_max: 0x0 0x49 0x33eda8
machdep.misc.timer_queue_trace:
machdep.misc.nmis: 0
machdep.xcpm.mode: 1
machdep.xcpm.pcps_mode: 0
machdep.xcpm.hard_plimit_max_100mhz_ratio: 50
machdep.xcpm.hard_plimit_min_100mhz_ratio: 8
machdep.xcpm.soft_plimit_max_100mhz_ratio: 50
machdep.xcpm.soft_plimit_min_100mhz_ratio: 8
machdep.xcpm.tuib_plimit_max_100mhz_ratio: 50
machdep.xcpm.tuib_plimit_min_100mhz_ratio: 8
machdep.xcpm.lpm_plimit_max_100mhz_ratio: 26
machdep.xcpm.tuib_enabled: 0
machdep.xcpm.lpm_enabled: 0
machdep.xcpm.power_source: 0
machdep.xcpm.bootplim: 0
machdep.xcpm.bootpst: 50
machdep.xcpm.tuib_ns: 0
machdep.xcpm.vectors_loaded_count: 1
machdep.xcpm.ratio_change_ratelimit_ns: 3000000
machdep.xcpm.ratio_changes_total: 29483176
machdep.xcpm.maxbusdelay: 4294967295
machdep.xcpm.maxintdelay: 0
machdep.xcpm.mid_applications: 0
machdep.xcpm.mid_relaxations: 0
machdep.xcpm.mid_mode: 1
machdep.xcpm.mid_cst_control_limit: 0
machdep.xcpm.mid_mode_active: 0
machdep.xcpm.mbd_mode: 1
machdep.xcpm.mbd_applications: 14
machdep.xcpm.mbd_relaxations: 32
machdep.xcpm.forced_idle_ratio: 100
machdep.xcpm.forced_idle_period: 30000000
machdep.xcpm.deep_idle_log: 0
machdep.xcpm.qos_txfr: 1
machdep.xcpm.deep_idle_count: 26
machdep.xcpm.deep_idle_last_stats: 0:03:25 CC7:99% C2:0% C3:0% C6:0% C7:0% C8:0% C9:0% C10:99%
machdep.xcpm.deep_idle_total_stats: 12:30:31 CC7:99% C2:0% C3:0% C6:0% C7:0% C8:0% C9:0% C10:99%
machdep.xcpm.cpu_thermal_level: 17
machdep.xcpm.gpu_thermal_level: 0
machdep.xcpm.io_thermal_level: 0
machdep.xcpm.io_control_engages: 0
machdep.xcpm.io_control_disengages: 0
machdep.xcpm.io_filtered_reads: 0
machdep.xcpm.pcps_rt_override_mode: 0
machdep.xcpm.io_cst_control_enabled: 1
machdep.xcpm.ring_boost_enabled: 0
machdep.xcpm.io_epp_boost_enabled: 1
machdep.xcpm.epp_override: 0
machdep.xcpm.perf_hints: 0
machdep.xcpm.pcps_rt_override_ns: 0
machdep.cpu.tlb.inst.large: 8
machdep.cpu.tlb.data.small: 64
machdep.cpu.tlb.data.small_level1: 64
machdep.cpu.address_bits.physical: 39
machdep.cpu.address_bits.virtual: 48
machdep.cpu.tsc_ccc.numerator: 200
machdep.cpu.tsc_ccc.denominator: 2
machdep.cpu.mwait.linesize_min: 64
machdep.cpu.mwait.linesize_max: 64
machdep.cpu.mwait.extensions: 3
machdep.cpu.mwait.sub_Cstates: 286531872
machdep.cpu.thermal.sensor: 1
machdep.cpu.thermal.dynamic_acceleration: 1
machdep.cpu.thermal.invariant_APIC_timer: 1
machdep.cpu.thermal.thresholds: 2
machdep.cpu.thermal.ACNT_MCNT: 1
machdep.cpu.thermal.core_power_limits: 1
machdep.cpu.thermal.fine_grain_clock_mod: 1
machdep.cpu.thermal.package_thermal_intr: 1
machdep.cpu.thermal.hardware_feedback: 0
machdep.cpu.thermal.energy_policy: 1
machdep.cpu.xsave.extended_state: 31 832 1088 0
machdep.cpu.xsave.extended_state1: 15 832 256 0
machdep.cpu.arch_perf.version: 4
machdep.cpu.arch_perf.number: 4
machdep.cpu.arch_perf.width: 48
machdep.cpu.arch_perf.events_number: 7
machdep.cpu.arch_perf.events: 0
machdep.cpu.arch_perf.fixed_number: 3
machdep.cpu.arch_perf.fixed_width: 48
machdep.cpu.cache.linesize: 64
machdep.cpu.cache.L2_associativity: 4
machdep.cpu.cache.size: 256
machdep.cpu.max_basic: 22
machdep.cpu.max_ext: 2147483656
machdep.cpu.vendor: GenuineIntel
machdep.cpu.brand_string: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
machdep.cpu.family: 6
machdep.cpu.model: 158
machdep.cpu.extmodel: 9
machdep.cpu.extfamily: 0
machdep.cpu.stepping: 13
machdep.cpu.feature_bits: 9221959987971750911
machdep.cpu.leaf7_feature_bits: 43804591 1073741824
machdep.cpu.leaf7_feature_bits_edx: 3154120192
machdep.cpu.extfeature_bits: 1241984796928
machdep.cpu.signature: 591597
machdep.cpu.brand: 0
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C
machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SGX BMI1 AVX2 SMEP BMI2 ERMS INVPCID FPU_CSDS MPX RDSEED ADX SMAP CLFSOPT IPT SGXLC MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD
machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT PREFETCHW RDTSCP TSCI
machdep.cpu.logical_per_package: 16
machdep.cpu.cores_per_package: 8
machdep.cpu.microcode_version: 244
machdep.cpu.processor_flag: 5
machdep.cpu.core_count: 8
machdep.cpu.thread_count: 16
machdep.user_idle_level: 0
machdep.x2apic_enabled: 0
machdep.eager_timer_evaluations: 89913
machdep.eager_timer_evaluation_max: 1171210
machdep.x86_fp_simd_isr_uses: 0
machdep.uncore_sample_state: 0
machdep.uncore_sample_mask: 1
machdep.uncore_sample_ctl: 0
machdep.uncore_sample_interval_ms: 500
machdep.uncore_pcie_mmio_base: -536870912
