Remote Storage¶
Share cached pipeline outputs across machines using S3. Push outputs from your laptop, pull them in CI or on a teammate's machine — stages with matching cached outputs skip instead of re-running.
Prerequisites¶
Install Pivot with S3 support:
Pivot uses aioboto3 under the hood. Standard AWS credentials (env vars, ~/.aws/credentials, IAM roles) work automatically.
Configure a Remote¶
# Add an S3 remote
pivot config set remotes.origin s3://my-bucket/pivot-cache
# Set it as default
pivot config set default_remote origin
Verify with:
The remote URL must be an S3 path (s3://bucket/prefix). Pivot stores cached files under this prefix, organized by content hash.
Push Outputs¶
After running your pipeline locally, upload cached outputs to S3:
Output:
Selective Push¶
# Push outputs from specific stages only
pivot push train
# Push a specific file
pivot push model/best.pkl
# Preview what would be uploaded
pivot push --dry-run
# More parallel uploads for large caches
pivot push --jobs 40
Options¶
| Option | Description |
|---|---|
-r, --remote |
Target a named remote instead of the default |
-n, --dry-run |
Show what would be pushed without uploading |
-j, --jobs |
Number of parallel upload threads |
Pull Outputs¶
On another machine, download cached outputs and restore them to the workspace:
pivot pull combines two operations: fetch (download from S3 to local cache) and checkout (restore files from local cache to workspace). This mirrors git pull = git fetch + git merge.
After pulling, pivot repro skips stages whose outputs are already present and match their lock file hashes.
Selective Pull¶
# Pull outputs for specific stages
pivot pull train
# Pull a specific file
pivot pull model/best.pkl
# Preview what would be downloaded
pivot pull --dry-run
Options¶
| Option | Description |
|---|---|
-r, --remote |
Pull from a named remote instead of the default |
-n, --dry-run |
Show what would be downloaded |
-j, --jobs |
Number of parallel download threads |
-f, --force |
Overwrite existing workspace files |
--only-missing |
Only restore files that don't exist on disk |
--checkout-mode |
symlink, hardlink, or copy (default: project config) |
Fetch and Checkout Separately¶
For more control, split the two steps:
# Download to local cache only (no workspace changes)
pivot fetch
# Restore from local cache to workspace
pivot checkout
This is useful when you want to pre-populate the cache without touching the working tree, or restore specific files after fetching.
# Fetch everything, then selectively checkout
pivot fetch
pivot checkout model/best.pkl
pivot checkout --only-missing
What Gets Transferred¶
| Data | Pushed/Pulled | Version controlled |
|---|---|---|
Cache files (.pivot/cache/files/) |
Yes | No (.gitignore) |
Lock files (.pivot/stages/*.lock) |
No | Yes (commit these) |
Config (.pivot/config.yaml) |
No | Yes (commit this) |
Lock files are the bridge. They map output paths to content hashes. Push uploads the files referenced by those hashes; pull downloads them. Always commit lock files to version control so other machines know what to pull.
Multiple Remotes¶
# Add a second remote
pivot config set remotes.backup s3://backup-bucket/pivot-cache
# Push to a specific remote
pivot push --remote backup
# Change the default
pivot config set default_remote backup
List configured remotes:
AWS Credentials¶
Pivot uses the standard AWS credential chain:
- Environment variables (
AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY) - AWS credentials file (
~/.aws/credentials) - IAM roles (EC2, ECS, Lambda)
- SSO/profile via
AWS_PROFILE
No Pivot-specific credential configuration needed — if aws s3 ls works, pivot push works.
GitHub Actions¶
For a complete GitHub Actions workflow, see the CI Integration guide. The key remote commands for CI are:
# In CI: pull cached outputs before running
pivot pull || true
# After successful run on main: push to remote
pivot push
Typical Team Workflow¶
Developer (local):
# Iterate on a feature branch
pivot repro
pivot push
git add .pivot/stages/*.lock
git commit -m "Update pipeline outputs"
git push
CI (after merge to main):
Teammate (pulling latest):
Troubleshooting¶
Push/pull fails with access denied:
Cache miss after pull — stages re-run unexpectedly:
Lock files may be out of sync. Ensure .pivot/stages/*.lock files are committed:
Slow transfers:
Increase parallelism:
Or set the default: pivot config set remote.jobs 40.
Related¶
- Caching — content-addressed caching and skip detection
- Cross-Repo Import — import artifacts from other Pivot projects
- CI Integration — more CI patterns and providers