In our previous blog post on JupyterLab and HTCondor with SLATE, we showed how a user can leverage multiple SLATE catalog applications to deploy a JupyterLab instance and a test HTCondor pool on SLATE.
In this blog post, however, we demonstrate how you can deploy a JupyterLab instance from the SLATE catalog and then use it to submit HTCondor jobs to the OSG’s central pool. This should be especially helpful for users who work with a production-scale high-throughput cluster and would like to use the condor-submit feature integrated into the SLATE JupyterLab application to submit their jobs.
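As a preview of what that workflow looks like from inside JupyterLab, here is a minimal, hypothetical HTCondor submit description. The executable and file names below are placeholders for illustration, not part of the SLATE application:

```
# job.sub -- hypothetical submit description file
executable   = myscript.sh
arguments    = input.dat
output       = job.out
error        = job.err
log          = job.log
request_cpus = 1
queue
```

From a JupyterLab terminal, you would then submit it with condor_submit job.sub.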
We assume you have a SLATE account, the SLATE client installed on your laptop (see the SLATE quickstart), and access to a SLATE-registered Kubernetes cluster.
One of the challenges of distributed high-throughput computing is efficiently moving data to and from jobs across a national fabric of resources. To tackle this problem, the Open Science Grid developed a data caching technology known as StashCache. Sites that already provide compute resources to the OSG can streamline data access by deploying a cache at their site. Thanks to the low-risk nature of caches, StashCache is an excellent way to get started with SLATE at your site while reducing the overall bandwidth consumed by OSG jobs. In this blog post, we’ll fully deploy an OSG StashCache service and link it up to the federation.
This post assumes that you already have a working SLATE cluster; if not, see the cluster installation guide: https://slateci.io/docs/cluster/
If you plan to join this StashCache to the OSG federation, you’ll additionally need to get an IGTF host certificate for this service from your institution.
First, you’ll want to get the configuration for the StashCache application:
slate app get-conf stashcache > stashcache.yaml
Open it in your favorite editor, e.g. vim:

vim stashcache.yaml
You’ll want to make changes in 4 sections. First, change the Instance tag to something memorable. I used “iu-mwt2”.
# Label for this particular deployed instance
# Results in a name like "stashcache-[Instance]"
Instance: "iu-mwt2"
If you want to use this cache for real data, you’ll want to point StashCache at some directory on your host system, from which StashCache will serve data.
For this deployment, I have mounted an XFS filesystem to the mountpoint “/slate-cache” on the host system. The configuration will need to be modified correspondingly.
StashCache:
  # The directory on the host system in which the cache should store its data.
  # If unspecified, ephemeral storage will be used, meaning that the cache
  # contents will be lost any time the application is restarted.
  CacheDirectory: /slate-cache
While you’re here, you may want to change other options as well. Since this will be a production cache, I increased the RamSize from the default 1GB to 64GB:
# The amount of memory the cache is allowed to use (in GB)
RamSize: 64g
In order for your Cache to communicate with the central OSG collector, you will need to acquire and install an IGTF certificate. The process of acquiring the certificate is outside of the scope of this blog post, but once you have it you’ll need to ensure that it is in PEM format.
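One quick way to sanity-check your files is with openssl. The snippet below generates a throwaway self-signed pair purely for illustration; in practice you would run only the two verification commands against your real IGTF hostcert.pem and hostkey.pem:

```shell
# For illustration only: create a throwaway self-signed certificate and key.
# In production, use the IGTF certificate issued by your institution instead.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=stash.example.org" \
  -keyout hostkey.pem -out hostcert.pem

# Verify both files are PEM-encoded and readable before creating the secret.
openssl x509 -in hostcert.pem -noout -subject -dates
openssl rsa  -in hostkey.pem -check -noout
```

If either verification command errors out, convert the files to PEM before proceeding.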
To install the certificate to the cluster, you’ll need to use the slate secret command:
slate secret create stashcache-cert --from-file=hostcert.pem --from-file=hostkey.pem
Then, you’ll want to take the secret name and put that into the StashCache config file:
# Host Certificate and Key
# The keys contained in the secret must be:
#   hostcert.pem
#   hostkey.pem
# run 'slate secret create --help' for usage
# Leaving this as the empty string will disable the authenticated cache.
hostCertSecret: stashcache-cert
It’s possible to install StashCache without an IGTF certificate; however, you will not be able to join your cache to the OSG federation.
Finally, once you have configured the StashCache application to your satisfaction, you can deploy the application:
slate app install stashcache --conf stashcache.yaml --group <your group> --cluster <your cluster>
If all goes well, SLATE will successfully install the cache and give you an instance ID. You can use that ID to check the status:
$ slate instance info instance_6rH9TMkq0fY
Fetching instance information...
Name                     Started                          App Version  Chart Version      Group  Cluster  ID
stashcache-mwt2-iu-test  2020-Jul-20 20:41:55.237425 UTC  v4.12.0-rc2  stashcache-0.1.15  mwt2   mwt2-iu  instance_6rH9TMkq0fY

Services:
Name                           Cluster IP      External IP     Ports           URL
stashcache-mwt2-iu-test-http   10.105.253.127  188.8.131.52    8000:31612/TCP  184.108.40.206:31612
stashcache-mwt2-iu-test-xroot  10.108.181.32   220.127.116.11  1094:32009/TCP  18.104.22.168:32009
From the URL field reported by SLATE, you can run a test to ensure your cache is working over the HTTP protocol:
$ curl http://22.214.171.124:31612/osgconnect/public/rynge/test.data
hello world!
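If you want to script this check (for monitoring, say), the pattern is a small sketch like the following. A throwaway local HTTP server stands in for the cache here so the example is self-contained; in practice, point CACHE_URL at the URL field reported by slate instance info and fetch a known public file:

```shell
#!/bin/sh
# Stand-in for the cache, for illustration only: serve a known test file locally.
mkdir -p /tmp/cache-demo
echo 'hello world!' > /tmp/cache-demo/test.data
( cd /tmp/cache-demo && exec python3 -m http.server 31612 ) &
SERVER_PID=$!
sleep 2

# The actual check: fetch the file and compare against its expected content.
CACHE_URL="http://localhost:31612"
if curl -sf "$CACHE_URL/test.data" | grep -q 'hello world!'; then
    echo "cache check OK"
else
    echo "cache check FAILED" >&2
fi

kill "$SERVER_PID"
```

The same two-line fetch-and-compare works unchanged against a real cache endpoint.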
A major priority of SLATE is ensuring that our clusters and applications are secure. To better secure the applications, there is an ongoing effort to ensure that they all have built-in Network Policies. These allow a user or site administrator to limit exactly who has access to a given application.
The SLATE team and collaborators continue to treat security as a major focus, and are configuring all applications offered in the SLATE stable catalog with Kubernetes Network Policy hooks in their Helm deployment charts. As an application developer, building this functionality into your application will make many site administrators much more comfortable with it running on their clusters.
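As a sketch of what such a policy can look like, here is a minimal Kubernetes NetworkPolicy. The name, label, and CIDR below are illustrative, not taken from any SLATE chart:

```yaml
# Illustrative policy: allow ingress to the application's pods only from a
# trusted address range; all other ingress to those pods is denied.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: example-app-ingress      # illustrative name
spec:
  podSelector:
    matchLabels:
      app: example-app           # illustrative label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - ipBlock:
            cidr: 192.0.2.0/24   # trusted range (documentation CIDR)
```

With a policy like this templated in the chart, a site administrator can reason precisely about which networks may reach the application.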
Posting a blog to the slateci.io website is easy, but there are a few steps to keep in mind.