Scalasca 2.x series


Version | Date | Description
2.6.1 15-Dec-2022 Latest Release
MD5sum: 56a49be3c2fe1c021ceeb8780a73757d

Includes:
  • Support for
    • Score-P v8.0, incl. OpenMP measurements via OMPT
    • OTF2 v3.0
    • Cube C Writer v4.8
  • Build system improvements:
    • Added proper support for additional compiler suites via --with-nocross-compiler-suite=<suite>:
      • amdclang: AMD ROCm compilers (amdclang, amdclang++)
      • nvhpc: NVIDIA HPC Toolkit compilers (nvc, nvc++)
      • oneapi: Intel oneAPI compilers (icx, icpx)
  • Automatic trace analyzer changes & improvements:
    • Fixed a performance regression when analyzing traces including OpenMP with few instrumented user regions.
    • Improved metric data aggregation and writing.
  • Measurement nexus (scan) changes:
    • Removed stalled resources counter from default POP multi-run preset and added an enhanced preset pop-with-stalled-resources.
2.6 19-Apr-2021 MD5sum: 566657db21f7bf87a7009653d330d8bf

Includes:
  • Build system improvements:
    • Auto-detect Cray XC platforms with ARM CPUs, supporting Cray, ARM, and GCC compilers
    • Added support for Clang and AMD AOCC compilers
    • Updated support for Spectrum MPI
  • Automatic trace analyzer changes & improvements:
    • Revised 'Early Reduce' wait state definition.
    • Added calculation of 'Early Reduce' delay costs.
    • Fixed various delay cost calculation and propagation issues.
    • Fixed various inconsistencies between wait-state and root-cause analysis.
    • Made POSIX threads analysis consistent with Score-P by avoiding thread function stub call paths underneath pthread_create. This also fixes a deadlock when analyzing traces containing "orphaned threads".
  • Measurement nexus (scan) changes:
    • Added preset mode for multi-run measurements, with a preset for POP analysis requirements as a use case.
    • Added support for multiple file systems in SCAN_TRACE_FILESYS by using a colon-separated list of paths.
  • Analysis report postprocessing changes:
    • Added metric hierarchies for CUDA, OpenCL, and OpenACC. (NOTE: The trace analysis still only supports host-side events!)
    • Renamed '-c' command-line option of square to '-C' for running sanity checks on newly created reports.
    • Added new '-c' command-line option to square to allow specifying the number of counters considered during report scoring (for consistency with scorep-score).
    • Added new '-x' command-line option to square to allow passing options directly through to scorep-score.
    • Avoid unnecessary aggregation/postprocessing of reports with multi-run experiments.
  • Substantial code cleanup.
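
The 2.6 notes above say that SCAN_TRACE_FILESYS accepts a colon-separated list of paths. As a rough illustration of that value format only (not Scalasca's own parsing code), the sketch below splits such a value the way PATH-style variables are commonly handled. The function name `split_path_list`, the example paths, and the choice to drop empty entries are all assumptions made for this example.

```python
import os


def split_path_list(value, separator=":"):
    """Split a separator-delimited list of paths, dropping empty entries."""
    return [entry for entry in value.split(separator) if entry]


if __name__ == "__main__":
    # Hypothetical file-system paths, used for illustration only.
    os.environ["SCAN_TRACE_FILESYS"] = "/scratch:/work"
    for path in split_path_list(os.environ["SCAN_TRACE_FILESYS"]):
        print(path)
```

How scan treats empty or duplicate entries is not stated in these notes, so the filtering shown here should not be read as documented behavior.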
2.5 22-Mar-2019 MD5sum: c4ea190408ef34592008f9706dc30a0c

Includes:
  • Support for
    • Score-P v5.0, incl. virtual process/thread topologies
  • Automatic trace analyzer changes & improvements:
    • Various fixes and improvements in timestamp correction algorithm.
    • Fixed 'Late Receiver' instance tracking.
    • Slightly improved analysis report data collation.
  • Added support for multi-run experiments.
  • Code refactoring and various bug fixes.
  • Improved user documentation:
    • Revised User Guide including command reference.
    • Added man pages.
2.4 14-May-2018 MD5sum: c9d09b71721a8345f172fc05debc38b3

Includes:
  • Support for
    • Cube v4.4
  • Build system improvements:
    • Fixed build issues with compilers defaulting to C++11 or higher (e.g., Intel 2017, PGI 17).
    • Fixed build issues with PGI 16+ compilers (pgCC no longer available).
    • Fixed build issues on Cray systems, now also properly taking the CRAYPE_LINK_TYPE setting into account.
  • Automatic trace analyzer changes & improvements:
    • Fixed a rare crash/deadlock in critical-path/delay analysis while analyzing MPI persistent communication.
    • Improved memory management.
    • Improved handling of OTF2 traces in SIONlib containers.
    • Improved trace reading times, especially at scale.
    • Fixed detection of wait states in active-target synchronization based on EPIK traces.
  • Code refactoring and various bug fixes.
2.3.1 20-May-2016 MD5sum: a83ced912b9d2330004cb6b9cefa7585

Includes:
  • Build system improvements:
    • Fixed build issue with GCC 6.1.
    • Fixed build issue on the Intel Xeon Phi platform.
2.3 14-Apr-2016 MD5sum: de782c8b6ecfce0e16a4b143ba7a9b5a

Includes:
  • Support for
    • Score-P v2.0
    • OTF2 v2.0
  • Automatic trace analyzer changes & improvements:
    • Experimental support for Score-P traces collected using sampling (see OPEN_ISSUES for limitations).
  • Improved analysis report postprocessing:
    • Revised metric hierarchies (organization, metric naming, etc).
    • Suppress calculation of performance properties that are only relevant for unused parallel programming models.
  • Performance property documentation fixes & improvements.
  • Build system improvements.
  • Code refactoring and various bug fixes.
2.2.2 19-Jun-2015 MD5sum: 2bafce988b0522d18072f7771e491ab9

Includes:
  • Platform support:
    • Fixed a build issue on the Intel Xeon Phi platform.
    • Improved support for the ibrun launcher.
  • Automatic trace analyzer changes & improvements:
    • Worked around rare run-time issue with MVAPICH2.
2.2.1 08-May-2015 MD5sum: e5083a75160257f8e2051fbe113272cd

Includes:
  • Platform support:
    • Added build system support for Power8/Linux.
    • Added build system support for 64-bit ARM/Linux (AArch64).
    • Prefer linking static over dynamic Cube/OTF2 libraries on Fujitsu K/FX10/FX100.
  • Automatic trace analyzer changes & improvements:
    • Fixed delay-cost propagation through OpenMP barrier wait states.
    • Various algorithmic optimizations reducing overall analysis time for traces of multi-threaded applications:
      • Improved memory management.
      • Improved trace preprocessing.
      • Improved timestamp correction.
  • Code refactoring and various bug fixes.
2.2 30-Jan-2015 MD5sum: 06e0380c612811a1ff9c183fed9331a9

Includes:
  • Support for
    • Score-P v1.4
    • OTF2 v1.5, incl. full SIONlib support (if configured)
    • Cube v4.3
  • Platform support:
    • Added support for Intel Xeon Phi, native mode only.
    • Added support for Fujitsu FX100 (thanks to T. Nakamura, Fujitsu Ltd).
  • Automatic trace analyzer changes & improvements:
    • Added basic support for POSIX threads.
    • Added basic support for OpenMP tasking.
    • Added lock contention analysis (OpenMP & POSIX threads).
    • Added root-cause/delay analysis (MPI & OpenMP).
    • New command-line options '--[no-]rootcause'.
  • Code refactoring and various bug fixes.
2.1 29-Aug-2014 MD5sum: bab9c2b021e51e2ba187feec442b96e6

Includes:
  • Support for
    • Score-P v1.3
    • OTF2 v1.4
  • Platform support:
    • Added support for Fujitsu FX10 & K computer.
    • Improved support for Cray systems.
  • Automatic trace analyzer changes & improvements:
    • Added Critical-path analysis.
    • Improved Late Receiver detection.
    • New command-line options '--[no-]critical-path' and '--single-pass'.
    • Fixed crash in data collation when number of OpenMP threads varied among MPI processes.
  • Code refactoring and various small bug fixes.
  • Initial version of updated User Guide (still work in progress).
2.0 13-Aug-2013 MD5sum: 0a666d4aef8ec5d32b77d1e034321fd1

Includes:
  • Support for
    • Score-P v1.2
    • OTF2 v1.2
    • Cube v4.2
  • New build system based on GNU autotools.
  • Significant amount of code refactoring.
  • Automatic trace analyzer changes & improvements:
    • Support for arbitrarily deep system trees.
    • Improved performance of timestamp correction.
    • Pattern instance tracking and statistics are now enabled by default.
    • New command-line options '--verbose', '--[no-]time-correct', and '--[no-]statistics'.
    • Limited backward-compatibility support for handling existing traces in EPILOG format generated by Scalasca v1.
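
Each release above lists an MD5sum for its download. A minimal sketch of checking a downloaded file against the listed value follows, using Python's standard `hashlib` module. The local filename `scalasca-2.6.1.tar.gz` and the helper names are assumptions for illustration; the digest is the one listed for 2.6.1.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed_md5):
    """Compare a file's MD5 digest with the value listed for a release."""
    return md5_of_file(path) == listed_md5.strip().lower()


if __name__ == "__main__":
    # Hypothetical local filename; the MD5 value is the one listed for 2.6.1.
    tarball = Path("scalasca-2.6.1.tar.gz")
    listed = "56a49be3c2fe1c021ceeb8780a73757d"
    if tarball.exists():
        print("OK" if matches_listed_md5(tarball, listed) else "MISMATCH")
    else:
        print(f"{tarball} not found")
```

An MD5 match guards against a corrupted or truncated download; it is not a strong defense against deliberate tampering.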