SLATE Quarterly Updates

Welcome to another round of SLATE project updates! It’s been quite some time since our last update, so this post will be especially long. On the plus side, we have a lot of cool and interesting new things to tell you about!

As always, we encourage you to try things out and let us know what’s working, what’s not, what can be improved and so on. For discussion, news and troubleshooting, the SLATE Slack workspace is the best place to reach us!

SLATE Portal

  • Security has been hardened, using the findings of an audit conducted by the Open Science Grid security team.
  • Assorted bug fixes and other code cleanup have been applied.

SLATE and Federated Operation Security Policy

  • Official platform polices are now located at https://slateci.io/docs/security-and-policies/

API Server and Client

  • The production SLATE federation now uses Helm 3 throughout.
  • Edge cluster registration refactored to address problems with auto-detection selecting private IP addresses.
  • Improved caching of available applications in the API server.
  • Updated the CLI client program to more consistently use exit status to signal failures, which should improve its usefulness in scripts.
  • Numerous fixes to the API server and catalog build system to account for changes in Helm 3.
  • Added support in the API server for automatically generating descriptions of edge cluster location using an external geocoding service.
  • Corrected mistakes in the API servers ability to collect and report storage and priority class information for edge clusters.
  • Expanded the CLI client to handle managing components like role-based access control and ingress controllers which are part of SLATE but require edge administrator privileges to manipulate to streamline installation and support updating and removal.
  • Created mechanisms for the SLATE client to handle tab completion across multiple shells
  • Added CLI commands for user management

SLATE Infrastructure

  • Configured all virtual machines used to form the SLATE platform which are housed at the University of Chicago to be scanned with InsightVM for security vulnerabilities.
  • Restructured automated builds of the CLI client to use the latest version of Alpine Linux for Linux builds, and a CentOS-based cross compiler to produce builds for Mac OS.
  • Work ongoing to improve Puppet infrastructure used by SLATE platform and clusters at the University of Chicago, University of Michigan, and University of Utah.

The SLATE Sandbox

  • Rebuilt with Kubernetes v1.18.0

MiniSLATE and SLATElite

  • MiniSLATE now uses K3s as its Kubernetes backend to provide for a more lightweight experience

Monitoring

  • Monitoring infrastructure ported to Helm 3
  • Ingress controller modified to allow embedded Grafana charts in the SLATE Console.
  • Prometheus and Thanos monitoring are now working for clusters at the University of Chicago, the University of Michigan, and the University of Utah, using OSiRIS storage infrastructure
  • Work continues to refine the SLATE monitoring infrastructure to be easy to use by sites
  • Work ongoing to develop CheckMK-based monitoring for the SLATE platform

Automation

  • Updates and hardening of Foreman and Puppet modules
  • Creation of Puppet modules that have ability to deploy full virtual Kubernetes machines or Kubernetes with SLATE pre-packaged
  • Updates of Puppet modules for updating physical hardware attributes

Applications

  • OSG HostedCE is now marked ‘stable’. The HostedCE application is being used in production by the the University of Utah and the University of Illinois
  • Logging sidecar pattern has now been integrated into HTCondor worker
  • Many SLATE charts have now been updated to have their container image colocated with the Chart definition
  • OSG GridFTP and Globus Connect v4 SLATE applications have been overhauled and can now support specifiable control and data port ranges
  • SLATE applications now support Network Policies, and will be integrated into all charts in coming weeks
  • The Open Science Grid has contributed an “Open Science CE” application, which is being used to support COVID19 research efforts.
  • MariaDB has landed in the stable catalog
  • MinIO is now available in the stable catalog
  • HTCondor submit and central manager charts are under active development in the incubator catalog. They currently have enough functionality to run a complete HTCondor pool within SLATE.
  • Jupyter with HTCondor is under active development in the incubator catalog
  • OSG Frontier Squid Chart extended to support advanced configuration options, including running multiple workers
  • Work to build Chart for Open OnDemand with Keycloak-based authentication is ongoing

Sites

  • University of Wisconsin-Madison has added a new SLATE cluster
  • New Mexico State University had added a new SLATE cluster
  • A new SLATE site has been added for the ATLAS collaboration in Prague, Czech Republic
  • The Texas Advanced Computing Center has added a SLATE node for the Frontera supercomputer
  • The Univ of Michigan cluster has been updated to Kubernetes 1.18
  • A SLATE cluster has been deployed at the South Pole in support of NSF polar research programs such as the South Pole Telescope. It has thus claimed the unofficial title of Southern Most Kubernetes Cluster.
The SLATE Team