Set Up the Data Curation Environment at the USDF¶
Most Rubin data curation activities can be done from an individual member’s own account.
Login Nodes and Profiles¶
Use the interactive node rubin-devl in most cases. In some special cases, you can use the DTN nodes
sdfdtn001 or sdfdtn002.
Create symlinks in your $HOME/.profile.d directory that point to files in /sdf/group/rubin/sw/profile.d.
These login configuration profiles should run before any additional data curation environment setup
profiles. A good way to do this is to create a file $HOME/.profile.d/99-data-curation.conf.
To obtain the latest LSST tools (butler) and Rucio tools, add the following to the
99-data-curation.conf file:
source /cvmfs/sw.lsst.eu/almalinux-x86_64/lsst_distrib/w_2026_02/loadLSST.sh
setup lsst_distrib
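The symlink step described above could be scripted roughly as follows. This is a minimal sketch, assuming you want every regular file in the shared directory linked and that existing entries in your own directory should be left alone; the function name link_profiles is hypothetical and not part of any USDF tooling.

```shell
# Link each regular file in a shared profile directory into a personal one.
# Arguments: <source directory> <destination directory>
link_profiles() {
    src_dir="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir" || return 1
    for f in "$src_dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        target="$dest_dir/$(basename "$f")"
        # Leave anything that already exists in the destination untouched.
        if [ -e "$target" ] || [ -L "$target" ]; then
            continue
        fi
        ln -s "$f" "$target"
    done
}

# Example call on the USDF:
#   link_profiles /sdf/group/rubin/sw/profile.d "$HOME/.profile.d"
```

Because existing files are skipped, rerunning the function after new profiles are added to the shared directory only creates the missing links.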
Access Embargo S3 Storage¶
In many cases, we use the MinIO client tool to access the Embargo S3 storage directly. To load the MinIO client tools, run the command:
module load mc
The MinIO client configuration file is located at $HOME/.mc/config.json. Put the following content
in the file:
{
  "version": "10",
  "aliases": {
    "embargo": {
      "url": "https://sdfembs3.sdf.slac.stanford.edu:443",
      "accessKey": "<get from Vault>",
      "secretKey": "<get from Vault>",
      "api": "s3v4",
      "path": "auto"
    }
  }
}
Note on conventions: “embargo” appears in some commands with two different meanings. In mc ls embargo/rubin-summit, it is
the embargo alias defined above. In butler query-datasets embargo "raw", it is the
embargo repository defined in the butler configuration file (see the file named by $DAF_BUTLER_REPOSITORY_INDEX).
Files in $HOME/.lsst used by Butler¶
Butler uses several credential files in your $HOME/.lsst directory to access S3 storage and the Butler
databases.
Remove the $HOME/.lsst/db-auth.yaml file if it exists. It is obsolete and should NOT be used.
AWS S3 Credentials¶
Butler uses the credentials in $HOME/.lsst/aws-credentials.ini to access the S3 storage
(Reference).
For example, the following contains access credentials for the Embargo S3 storage:
[embargo]
aws_access_key_id = <your access key id from Vault>
aws_secret_access_key = <your secret access key from Vault>
Postgres Credentials¶
Butler uses the information in $HOME/.lsst/postgres-credentials.txt to access the Butler databases. The following two
lines are examples of the file's content:
usdf-butler.slac.stanford.edu:5432:lsstdb1:rubin:<repo main's DB password in Vault>
usdf-butler-embargo-db-tx.sdf.slac.stanford.edu:5432:lsstdb1:rubin:<repo embargo's DB password in Vault>
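Both credential files hold secrets, so it is worth making them readable by the owner only. The lines above follow the PostgreSQL password-file layout (hostname:port:database:username:password), and libpq ignores a password file that is group- or world-accessible; assuming Butler reads the file through that mechanism, loose permissions could also cause silent authentication failures. The sketch below uses a hypothetical helper name, secure_lsst_credentials.

```shell
# Restrict the Butler credential files so only the owner can read or write them.
# Argument: directory holding the files (defaults to $HOME/.lsst).
secure_lsst_credentials() {
    cred_dir="${1:-$HOME/.lsst}"
    for name in aws-credentials.ini postgres-credentials.txt; do
        # Only touch files that are actually present.
        if [ -f "$cred_dir/$name" ]; then
            chmod 600 "$cred_dir/$name"
        fi
    done
}

# Example call:
#   secure_lsst_credentials
```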
The Rucio Configuration¶
If you need to work with Rucio, add the following to the 99-data-curation.conf file:
export RUCIO_CONFIG=$HOME/.config/rucio-rubin.config
export RUCIO_ACCOUNT=$(id -un)
The $RUCIO_CONFIG file will look like this:
[client]
rucio_host = https://rubin-rucio.slac.stanford.edu:8443
auth_host = https://rubin-rucio.slac.stanford.edu:8443
auth_type = ssh
ssh_private_key = $HOME/.ssh/id_rsa
The above example uses SSH authentication for Rucio. Most team members use this method, but other
authentication methods such as X.509 are also
supported. If you don’t have a Rucio account, ask another member of the team to create
one for you with appropriate privileges. The above example also assumes that your Rucio account name
is the same as your Unix user name (id -un).
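Before running any Rucio command, it can help to confirm that the two environment variables and the configuration file are in place. The following is a minimal sketch of such a check; check_rucio_env is a hypothetical helper, and it only inspects local settings without contacting the Rucio server.

```shell
# Report whether the local Rucio client environment looks complete.
# Returns 0 when every check passes, 1 otherwise.
check_rucio_env() {
    rc=0
    if [ -z "${RUCIO_ACCOUNT:-}" ]; then
        echo "RUCIO_ACCOUNT is not set" >&2
        rc=1
    fi
    if [ -z "${RUCIO_CONFIG:-}" ]; then
        echo "RUCIO_CONFIG is not set" >&2
        rc=1
    elif [ ! -r "$RUCIO_CONFIG" ]; then
        echo "cannot read $RUCIO_CONFIG" >&2
        rc=1
    elif ! grep -q '^\[client\]' "$RUCIO_CONFIG"; then
        echo "$RUCIO_CONFIG has no [client] section" >&2
        rc=1
    fi
    return "$rc"
}

# Example call:
#   check_rucio_env && echo "Rucio environment looks complete"
```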
Access Privileges¶
Ask other members of the team and S3DF for access to the various secrets in Vault, to the Kubernetes vClusters, and to Rucio/FTS/RSEs.
Vault¶
For convenience, define the following environment variable in the 99-data-curation.conf file:
export VAULT_ADDR="https://vault.sdf.slac.stanford.edu:8200"
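Collecting the snippets from this page, a complete 99-data-curation.conf might look like the sketch below. The readability guard around the LSST loader and the LSST_LOADER variable name are additions for illustration, so that logins on a node without CVMFS do not fail; everything else is taken from the snippets above.

```shell
# $HOME/.profile.d/99-data-curation.conf (consolidated sketch)

# LSST stack (butler and related tools); skipped if the loader is not readable.
LSST_LOADER=/cvmfs/sw.lsst.eu/almalinux-x86_64/lsst_distrib/w_2026_02/loadLSST.sh
if [ -r "$LSST_LOADER" ]; then
    source "$LSST_LOADER"
    setup lsst_distrib
fi
unset LSST_LOADER

# Rucio client settings
export RUCIO_CONFIG="$HOME/.config/rucio-rubin.config"
export RUCIO_ACCOUNT="$(id -un)"

# Vault server address
export VAULT_ADDR="https://vault.sdf.slac.stanford.edu:8200"
```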
Each time you start using Vault, do the following:
module load vault
vault login -method=ldap username=$(id -un)
Type in your S3DF Windows AD password when prompted.
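To avoid retyping the password while a token is still valid, the login can be wrapped in a small check. This is a sketch only: vault_ensure_login is a hypothetical helper, and it assumes VAULT_ADDR is exported and the vault command is already on your PATH (for example after module load vault). It relies on vault token lookup, which fails when no usable token is cached.

```shell
# Log in to Vault only when no valid token is already cached.
vault_ensure_login() {
    if vault token lookup >/dev/null 2>&1; then
        echo "Vault token is still valid" >&2
    else
        # Same login command as above; prompts for the password.
        vault login -method=ldap username="$(id -un)"
    fi
}

# Example call:
#   vault_ensure_login
```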
Kubernetes¶
Kubernetes vClusters that you will likely need access to:
usdf-embargo-dmz
usdf-rucio (and -dev)
usdf-fts3 (and -dev)
Access to the usdf-rucio and usdf-fts3 vClusters is needed only for configuration and similar tasks.
It is not needed for day-to-day use of Rucio/FTS and RSEs.
Kubernetes access tokens expire in a day or two, so you will need to regenerate them from
time to time. To do so, go to https://k8s.slac.stanford.edu/<vcluster-name> and follow the instructions
there.
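When several tokens need renewing at once, the page addresses can be generated from the vCluster names using the URL pattern above. The helper below is a hypothetical convenience, not part of any S3DF tooling; it only prints URLs and does not contact the clusters.

```shell
# Print the token page URL for each vCluster name given as an argument.
vcluster_token_urls() {
    for name in "$@"; do
        printf 'https://k8s.slac.stanford.edu/%s\n' "$name"
    done
}

# Example call:
#   vcluster_token_urls usdf-embargo-dmz usdf-fts3
```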
FTS and RSEs¶
To register with VOMS and to access the FTS and RSEs from the command line, you will need an X.509 certificate. Ask other members of the team for help.