Conversation
Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: ef6d3cdb-5a23-445c-b39a-510b659023e4
Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: ef6d3cdb-5a23-445c-b39a-510b659023e4 Signed-off-by: Javier Rodriguez <javier@chainloop.dev>
Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: ef6d3cdb-5a23-445c-b39a-510b659023e4
Strengthen path validation in archive extraction to explicitly reject: - Absolute paths (e.g., /etc/passwd) - Relative path traversal attempts (e.g., ../ or foo/../..) Added direct unit test of safeArchivePath and integration test via tar with traversal entry to verify rejection. Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: ef6d3cdb-5a23-445c-b39a-510b659023e4
Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: ef6d3cdb-5a23-445c-b39a-510b659023e4
Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: ef6d3cdb-5a23-445c-b39a-510b659023e4
Adds an exported AddMaterialsFromArchive method on the Crafter that walks every entry in a zip/tar/tar.gz archive, stages each entry as an independent material via stageMaterial, and commits the in-memory state in a single stateManager.Write call. If any entry fails, previously staged entries are rolled back so no partial state is ever persisted. Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: ef6d3cdb-5a23-445c-b39a-510b659023e4
…ry on archive expand Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: ef6d3cdb-5a23-445c-b39a-510b659023e4
Add MaxExtractEntries/MaxExtractSize to AttestationAddOpts and wire them into AttestationAdd with defaults from materials.DefaultArchiveLimits(). Change Run to return ([]*AttestationStatusMaterial, error) and insert an explode branch before the single-material switch: when --kind is set and the value is a non-archive-native archive, delegate to crafter.AddMaterialsFromArchive and return N results; otherwise the single-material path continues unchanged and its result is wrapped in a 1-element slice. Add shouldExplode helper and TestShouldExplode. Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: ef6d3cdb-5a23-445c-b39a-510b659023e4
Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: ef6d3cdb-5a23-445c-b39a-510b659023e4
Add TestAddMaterialsFromArchiveBehavior to crafter_test.go covering five end-to-end scenarios via AddMaterialsFromArchive: name collision suffixing (scan-json / scan-json-1), name prefix, skipping dirs and symlinks in tar.gz, path-traversal rejection with atomic rollback, and tar.gz happy path with two materials. Fixtures are built programmatically with buildZip/buildTarGz helpers using t.TempDir(); no binary fixtures committed. Also regenerate app/cli/documentation/cli-reference.mdx to include the --max-extract-entries and --max-extract-size flags added in earlier tasks. Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: ef6d3cdb-5a23-445c-b39a-510b659023e4
… tests Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: ef6d3cdb-5a23-445c-b39a-510b659023e4
…uards - detectByMagic now returns (ArchiveNone, nil) when os.Open fails so non-file material values (STRING, CONTAINER_IMAGE) never produce a spurious "no such file" error through the shouldExplode path. - safeArchivePath drops the over-broad strings.Contains(name, "..") early-return that wrongly rejected legitimate filenames like foo..bar.json; traversal detection now relies solely on path.Clean against a virtual root (/root/), which is the only reliable check. - Add a Warn log when --policy-input-from-file is supplied with an archive value so users know evidence cross-links are skipped on the explode path. - Name per-entry temp files with the allocated unique material name (matName) instead of filepath.Base(name) to eliminate basename collisions; remove each temp file immediately after staging to keep disk usage bounded. - Extend TestDetectArchive with .tar and .tgz cases; add writeTar helper mirroring writeTarGz without the gzip layer. Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: ef6d3cdb-5a23-445c-b39a-510b659023e4
Render multi-material explode output as a single JSON array so --output json stays a parseable document; only the table renderer is emitted per material. Drop the unused size parameter from the archive visit signature, document the intentional zero-length-entry skips and the zip symlink-detection limitation, and correct the --max-extract-size default label to 1GiB. Build the crafter archive test fixture in-process and drop the checked-in binary blob and SDD process artifacts. Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: da72e107-14e9-4da1-add0-28004f542628, ef6d3cdb-5a23-445c-b39a-510b659023e4
AI Session AnalysisMissing AI Coding SessionsWe detected commits in this PR that were AI-assisted, but the matching Chainloop Trace session(s) could not be found in Chainloop. Please make sure the AI coding session evidence has been sent by the Chainloop CLI, or add the Learn more about Chainloop Trace. Powered by Chainloop and Chainloop Trace |
End-to-end examplesVerified by running the locally built CLI against a control plane (dry-run / local state). Server side is unchanged, so this needs only a CLI build. Explode a zip into one material per file
Behaviors verified
|
There was a problem hiding this comment.
7 issues found across 8 files
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
Additional examples — across material typesEach archive was built from the repo's own
Each entry is crafted with the same |
Validate --max-extract-size before the uint64->int64 cast so oversized values are rejected instead of wrapping negative. Check the per-entry temp file Close error before staging so a failed flush never produces an incomplete material, and write each entry into its own temp subdirectory under its original basename so the recorded artifact filename keeps the real name rather than the sanitized material key. Reject Windows drive-letter absolute paths in safeArchivePath, and only swallow not-found errors in detectByMagic so permission/I/O errors surface. Treat non-positive extraction limits as use-default. Assert the exact detected archive format and the preserved filename in tests. Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: da72e107-14e9-4da1-add0-28004f542628, ef6d3cdb-5a23-445c-b39a-510b659023e4
There was a problem hiding this comment.
1 issue found across 7 files (changes from recent commits).
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
A non-file material value whose first path segment is an existing regular file (e.g. a CONTAINER_IMAGE ref like "registry/app:v1") yields ENOTDIR on open rather than not-found. Swallow it alongside fs.ErrNotExist so such values detect as non-archive instead of failing detection; genuine permission/I/O errors still surface. Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: da72e107-14e9-4da1-add0-28004f542628, ef6d3cdb-5a23-445c-b39a-510b659023e4
…tics filepath.Base treats backslash as a separator only on Windows, so an entry name embedding a literal backslash produced different material names across platforms. Add materials.ArchiveEntryBaseName, which normalizes separators and uses path.Base, and derive the per-entry basename through it so exploded material names are identical on every OS. Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: da72e107-14e9-4da1-add0-28004f542628, ef6d3cdb-5a23-445c-b39a-510b659023e4
|
Thanks, let's make the names based on the original zipped material name |
Per-entry material names are now sequential rather than derived from the entry basename: material-1, material-2, … with no --name, or <name>-1, <name>-2, … when --name is provided (used as the prefix). The original entry filename is still preserved in the recorded artifact metadata. Replaces NameAllocator.Allocate with AllocateSequential. Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: da72e107-14e9-4da1-add0-28004f542628, ef6d3cdb-5a23-445c-b39a-510b659023e4
There was a problem hiding this comment.
1 issue found across 4 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="pkg/attestation/crafter/crafter.go">
<violation number="1" location="pkg/attestation/crafter/crafter.go:873">
P2: Archive-expanded material names use sequential numbering instead of deriving from entry filenames, directly contradicting the PR description which states "Per-entry material names derive from the entry filename, sanitized to DNS-1123 with collision suffixes."</violation>
</file>
Tip: Review your code locally with the cubic CLI to iterate faster.
Re-trigger cubic
| // (with archive "/" semantics, OS-independently) and used for the temp | ||
| // file so the recorded artifact filename preserves the real name. | ||
| base := materials.ArchiveEntryBaseName(name) | ||
| matName := allocator.AllocateSequential(namePrefix) |
There was a problem hiding this comment.
P2: Archive-expanded material names use sequential numbering instead of deriving from entry filenames, directly contradicting the PR description which states "Per-entry material names derive from the entry filename, sanitized to DNS-1123 with collision suffixes."
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At pkg/attestation/crafter/crafter.go, line 873:
<comment>Archive-expanded material names use sequential numbering instead of deriving from entry filenames, directly contradicting the PR description which states "Per-entry material names derive from the entry filename, sanitized to DNS-1123 with collision suffixes."</comment>
<file context>
@@ -865,10 +865,12 @@ func (c *Crafter) AddMaterialsFromArchive(
+ // file so the recorded artifact filename preserves the real name.
base := materials.ArchiveEntryBaseName(name)
- matName := allocator.Allocate(namePrefix, base)
+ matName := allocator.AllocateSequential(namePrefix)
// Give each entry its own temp subdirectory (named by the unique material
</file context>
Exploded archive materials are now numbered from 0 (material-0, material-1, … or <name>-0, <name>-1, …) instead of 1. Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: da72e107-14e9-4da1-add0-28004f542628, ef6d3cdb-5a23-445c-b39a-510b659023e4
Avoid the repeated "material" string literal (goconst). Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: da72e107-14e9-4da1-add0-28004f542628, ef6d3cdb-5a23-445c-b39a-510b659023e4
Manual end-to-end verificationRun against the labs control plane + CAS ( Fixture Observed results:
|
Fixes revive redefines-builtin-id on the archive behavior test. Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: da72e107-14e9-4da1-add0-28004f542628, ef6d3cdb-5a23-445c-b39a-510b659023e4
| chainloop attestation add --name sigcheck --value sigcheckResult.csv --kind SYSINTERNALS_SIGCHECK \ | ||
| --policy-input-from-file ignored_paths=exception.csv:Path`, | ||
| RunE: func(cmd *cobra.Command, _ []string) error { | ||
| maxExtractSizeBytes, err := bytefmt.ToBytes(maxExtractSize) |
There was a problem hiding this comment.
how do you know it's something to extract?
There was a problem hiding this comment.
It no longer explodes on detection alone — only when the kind is in the SBOM/SARIF allowlist AND the value sniffs as an archive (e358fb8).
| } | ||
| // Guard against the uint64->int64 cast wrapping negative, which would | ||
| // later surface as a misleading "archive too large" error. | ||
| if maxExtractSizeBytes > math.MaxInt64 { |
There was a problem hiding this comment.
this shoudl not apply to all materials just when we know taht we want to untar smth. Customers can provide a regular zip file.
There was a problem hiding this comment.
Agreed. As of e358fb8 explosion only applies to the SBOM/SARIF allowlist; a regular zip for any other kind is stored as a single material.
| // n is a zero-based counter that advances across calls and skips names already | ||
| // in use. prefix is sanitized to DNS-1123; an empty or symbol-only prefix yields | ||
| // the base "material" (so entries are named material-0, material-1, …). | ||
| func (a *NameAllocator) AllocateSequential(prefix string) string { |
There was a problem hiding this comment.
Right — consolidated in e358fb8. materials.SanitizeMaterialName is now the single sanitizer (returns the bare component, empty when nothing usable), and the action layer's sanitizeMaterialNamePart delegates to it with its own "input" fallback instead of duplicating the logic.
Drop the redundant explode bool from shouldExplode (derivable from the returned ArchiveFormat) and branch on the format at the call site. Reuse output.EncodeOutput for the multi-material result instead of a hand-rolled format switch, adding []*AttestationStatusMaterial to the tabulated-data union so the slice renders as a single JSON array. Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: da72e107-14e9-4da1-add0-28004f542628, ef6d3cdb-5a23-445c-b39a-510b659023e4
| return nil | ||
| } | ||
|
|
||
| func walkTar(p string, gzipped bool, visit func(name string, r io.Reader) error) error { |
There was a problem hiding this comment.
Tar is supported — walkTar covers .tar, .tar.gz and .tgz (gzip-wrapped). See the note on the zip thread re: reusing existing private readers.
| return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') | ||
| } | ||
|
|
||
| func walkZip(p string, visit func(name string, r io.Reader) error) error { |
There was a problem hiding this comment.
We do — walkZip handles .zip and walkTar handles .tar/.tar.gz/.tgz. There was no shared/exported archive walker to reuse (zap.go, helmchart.go and radamsa.go each inline their own private zip/tar reading), so WalkArchiveEntries is the new shared helper in the materials package. Happy to refactor those existing readers onto it as a follow-up if you'd like.
There was a problem hiding this comment.
I mean if we don't have it already somewhere else
Address review feedback: explode only kinds with a meaningful bundle form (SBOM_CYCLONEDX_JSON, SBOM_SPDX_JSON, SARIF) instead of exploding any archive that is not archive-native. Every other kind (ARTIFACT, EVIDENCE, ZAP_DAST_ZIP, …) records the archive whole, so a regular zip provided as a single material is never expanded. Replaces the archive-native denylist with an explodable-kind allowlist. Also consolidate name sanitization: SanitizeMaterialName now returns the bare sanitized component (empty when nothing remains) and callers supply their fallback, so the action layer's sanitizeMaterialNamePart reuses it instead of duplicating the logic. Assisted-by: Claude Code Signed-off-by: Javier Rodriguez <javier@chainloop.dev> Chainloop-Trace-Sessions: da72e107-14e9-4da1-add0-28004f542628, ef6d3cdb-5a23-445c-b39a-510b659023e4
Summary
chainloop attestation add --kind <KIND> --value <archive>now unpacks a.zip,.tar,.tar.gz, or.tgzfor the kinds that have a meaningful "bundle of the same kind" archive form and records each contained regular file as its own material of the given kind, instead of requiring oneatt addinvocation per file.Behavior
SBOM_CYCLONEDX_JSON,SBOM_SPDX_JSON, andSARIF. For one of these, when the value resolves to a supported archive, each contained regular file is added as an individual material of that kind.ARTIFACT,EVIDENCE,ZAP_DAST_ZIP) records the archive whole, so a customer can still provide a regular zip as a single material.material-0,material-1, … with no--name, or<name>-0,<name>-1, … when--nameis provided (used as the prefix, sanitized to DNS-1123). Names already present in the attestation are skipped. The recorded artifact filename preserves the original entry basename.--annotationare applied to every extracted material.--max-extract-entries(default 10000) and--max-extract-size(default 1GiB) are enforced while streaming, and directory, symlink, absolute, and path-traversal entries are skipped or rejected.For non-explodable kinds, non-archive values, and no
--kind, behavior is unchanged. There are no server-side, contract, or gRPC changes.Assumptions and scope
--kind; archives mixing material types are not split by type.--policy-input-from-fileis ignored on the explode path (a warning is logged); policy inputs and evidence cross-links are not recorded for exploded materials..zip,.tar,.tar.gz,.tgz); bzip2/xz are not supported in this iteration.AI disclosure
This contribution was assisted by Claude Code.