Commit ba6441e

KEP-5936: Add user fields to atomic write volumes
Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>
1 parent c68dfb9 commit ba6441e

2 files changed

+440
-0
lines changed

Lines changed: 394 additions & 0 deletions
@@ -0,0 +1,394 @@
1+
# KEP-5936: Add user fields to atomic write volumes
2+
3+
<!-- toc -->
4+
- [Release Signoff Checklist](#release-signoff-checklist)
5+
- [Summary](#summary)
6+
- [Motivation](#motivation)
7+
- [Goals](#goals)
8+
- [Non-Goals](#non-goals)
9+
- [Proposal](#proposal)
10+
- [User Stories (Optional)](#user-stories-optional)
11+
- [Story 1: define owner UID of mounted volume files](#story-1-define-owner-uid-of-mounted-volume-files)
12+
- [Constraints](#constraints)
13+
- [Risks and Mitigations](#risks-and-mitigations)
14+
- [Design Details](#design-details)
15+
- [Changes to API Specs](#changes-to-api-specs)
16+
- [Test Plan](#test-plan)
17+
- [Prerequisite testing updates](#prerequisite-testing-updates)
18+
- [Unit tests](#unit-tests)
19+
- [Integration tests](#integration-tests)
20+
- [e2e tests](#e2e-tests)
21+
- [Graduation Criteria](#graduation-criteria)
22+
- [Alpha](#alpha)
23+
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
24+
- [Version Skew Strategy](#version-skew-strategy)
25+
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
26+
- [Feature Enablement and Rollback](#feature-enablement-and-rollback)
27+
- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
28+
- [Monitoring Requirements](#monitoring-requirements)
29+
- [Dependencies](#dependencies)
30+
- [Scalability](#scalability)
31+
- [Troubleshooting](#troubleshooting)
32+
- [Implementation History](#implementation-history)
33+
- [Drawbacks](#drawbacks)
34+
- [Alternatives](#alternatives)
35+
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
36+
<!-- /toc -->
37+
38+
## Release Signoff Checklist
39+
40+
<!--
41+
**ACTION REQUIRED:** In order to merge code into a release, there must be an
42+
issue in [kubernetes/enhancements] referencing this KEP and targeting a release
43+
milestone **before the [Enhancement Freeze](https://git.k8s.io/sig-release/releases)
44+
of the targeted release**.
45+
46+
For enhancements that make changes to code or processes/procedures in core
47+
Kubernetes—i.e., [kubernetes/kubernetes], we require the following Release
48+
Signoff checklist to be completed.
49+
50+
Check these off as they are completed for the Release Team to track. These
51+
checklist items _must_ be updated for the enhancement to be released.
52+
-->
53+
54+
Items marked with (R) are required *prior to targeting to a milestone / release*.
55+
56+
- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
57+
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
58+
- [ ] (R) Design details are appropriately documented
59+
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
60+
- [ ] e2e Tests for all Beta API Operations (endpoints)
61+
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
62+
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
63+
- [ ] (R) Graduation criteria is in place
64+
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) within one minor version of promotion to GA
65+
- [ ] (R) Production readiness review completed
66+
- [ ] (R) Production readiness review approved
67+
- [ ] "Implementation History" section is up-to-date for milestone
68+
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
69+
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
70+
71+
<!--
72+
**Note:** This checklist is iterative and should be reviewed and updated every time this enhancement is being considered for a milestone.
73+
-->
74+
75+
[kubernetes.io]: https://kubernetes.io/
76+
[kubernetes/enhancements]: https://git.k8s.io/enhancements
77+
[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
78+
[kubernetes/website]: https://git.k8s.io/website
79+
80+
## Summary
81+
82+
This KEP proposes adding optional `DefaultUser` and `User` fields to atomic write volumes, defining
83+
the owner UID of the written files. Atomic write volumes include ConfigMap, Secret, DownwardAPI
84+
and Projected volumes.
85+
86+
This lets software with strict file ownership requirements run as a non-root user
87+
while reading files mounted from atomic write volumes.
88+
89+
## Motivation
90+
91+
This KEP resolves a long-standing and recurring request to configure atomic write volume files with proper
92+
ownership. Several issues have been filed since 2014, each with many comments and reactions,
93+
demonstrating strong demand from the Kubernetes user base.
94+
95+
Many popular software packages require strict file ownership, such as MongoDB replica set
96+
[key files][mongodb-key-files] and SSHD [host keys][sshd-host-keys]. The existing atomic write volume
97+
implementation creates files owned by root and therefore does not satisfy such ownership requirements.
98+
A known workaround is to run an init container as root to perform `chown`.
99+
100+
However, this workaround is not possible in clusters enforcing the [restricted][restricted-policy] Pod
101+
Security Standards policy, which follows current pod hardening best practices. The policy requires pods to
102+
run as non-root and drop all capabilities, which rules out a root init container. Moreover, even in
103+
less hardened clusters, the workaround creates unnecessary friction and maintenance overhead for the users.
104+
105+
[mongodb-key-files]: https://www.mongodb.com/docs/manual/tutorial/enforce-keyfile-access-control-in-existing-replica-set/#enforce-keyfile-access-control-on-existing-replica-set
106+
[sshd-host-keys]: https://man.openbsd.org/sshd_config#HostKey
107+
[restricted-policy]: https://kubernetes.io/docs/concepts/security/pod-security-standards/#restricted
108+
109+
### Goals
110+
111+
1. Allow users to optionally define the desired file owner UID of atomic write volume files.
112+
113+
### Non-Goals
114+
115+
1. Define file owner GID. This is already covered by `PodSecurityContext.FsGroup` and
116+
`PodSecurityContext.SupplementalGroups`.
117+
118+
2. Configure file ownership of Windows pods.
119+
120+
3. Configure file ownership of other volume types.
121+
122+
## Proposal
123+
124+
1. Introduce an optional `DefaultUser` field to atomic write volume API objects. Atomic write
125+
volumes include ConfigMap, Secret, DownwardAPI and Projected volumes.
126+
127+
2. Introduce an optional `User` field to the per-item API objects of atomic write volume types,
128+
allowing users to define file owner UID per item.
129+
130+
3. When writing files into the volumes, the kubelet sets each file's owner UID according to
131+
the new fields.
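
How the two fields could combine for a single file can be sketched as follows. The precedence shown here (item-level `User` first, then volume-level `DefaultUser`, then the kubelet's existing default) is an assumption modeled on how `mode` and `defaultMode` interact; this KEP does not spell it out. The helper names `effectiveUser` and `ptr` are hypothetical.

```go
package main

import "fmt"

// effectiveUser returns the owner UID to apply to one projected file.
// Assumed precedence (not stated in the KEP): the item-level User wins,
// then the volume-level DefaultUser; nil means "keep today's behavior".
func effectiveUser(defaultUser, itemUser *int64) *int64 {
	if itemUser != nil {
		return itemUser
	}
	return defaultUser
}

// ptr is a small helper for building *int64 values.
func ptr(v int64) *int64 { return &v }

func main() {
	fmt.Println(*effectiveUser(ptr(1000), ptr(2000))) // item-level value
	fmt.Println(*effectiveUser(ptr(1000), nil))       // volume-level value
	fmt.Println(effectiveUser(nil, nil) == nil)       // neither field set
}
```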
132+
133+
### User Stories (Optional)
134+
135+
#### Story 1: define owner UID of mounted volume files
136+
137+
As a Kubernetes user, I want to run software that makes use of files mounted from Kubernetes volumes
138+
and define the owner UID of the mounted files.
139+
140+
### Constraints
141+
142+
This won't be implemented for Windows pods, since Windows doesn't support setting file ownership for virtualized
143+
container accounts.
144+
145+
However, this is a common limitation of many related fields, such as `runAsUser`, `runAsGroup`, `fsGroup`,
146+
`supplementalGroups`, `defaultMode` and `mode`.
147+
148+
### Risks and Mitigations
149+
150+
Risks are minimal.
151+
152+
The new fields are optional, and affect an ephemeral volume or an ephemeral volume file mapping only.
153+
154+
## Design Details
155+
156+
### Changes to API Specs
157+
158+
```go
159+
type SecretVolumeSource struct {
160+
+ DefaultUser *int64
161+
}
162+
163+
type ConfigMapVolumeSource struct {
164+
+ DefaultUser *int64
165+
}
166+
167+
type ProjectedVolumeSource struct {
168+
+ DefaultUser *int64
169+
}
170+
171+
type DownwardAPIVolumeSource struct {
172+
+ DefaultUser *int64
173+
}
174+
175+
type KeyToPath struct {
176+
+ User *int64
177+
}
178+
179+
type DownwardAPIVolumeFile struct {
180+
+ User *int64
181+
}
182+
183+
type ServiceAccountTokenProjection struct {
184+
+ User *int64
185+
}
186+
187+
type ClusterTrustBundleProjection struct {
188+
+ User *int64
189+
}
190+
191+
type PodCertificateProjection struct {
192+
+ User *int64
193+
}
194+
```
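
To illustrate what a user-facing volume definition might look like, the sketch below serializes a Secret volume source that sets the new volume-level field. The struct types are local stand-ins rather than the real `k8s.io/api/core/v1` types, and the JSON names `defaultUser` and `user` are assumptions; the final serialized names are decided during API review.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// KeyToPath is a local stand-in for the core/v1 type of the same name.
type KeyToPath struct {
	Key  string `json:"key"`
	Path string `json:"path"`
	User *int64 `json:"user,omitempty"` // assumed JSON name
}

// SecretVolumeSource is a local stand-in for the core/v1 type of the same name.
type SecretVolumeSource struct {
	SecretName  string      `json:"secretName,omitempty"`
	Items       []KeyToPath `json:"items,omitempty"`
	DefaultUser *int64      `json:"defaultUser,omitempty"` // assumed JSON name
}

// renderSecretVolume returns the JSON form of a volume source.
func renderSecretVolume(src SecretVolumeSource) (string, error) {
	out, err := json.Marshal(src)
	return string(out), err
}

func main() {
	uid := int64(999)
	s, err := renderSecretVolume(SecretVolumeSource{
		SecretName:  "mongodb-keyfile",
		DefaultUser: &uid,
		Items:       []KeyToPath{{Key: "keyfile", Path: "keyfile"}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

Because both fields are pointers with `omitempty`, a volume that does not set them serializes exactly as it does today.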
195+
196+
### Test Plan
197+
198+
[x] I/we understand the owners of the involved components may require updates to
199+
existing tests to make this code solid enough prior to committing the changes necessary
200+
to implement this enhancement.
201+
202+
##### Prerequisite testing updates
203+
204+
No.
205+
206+
##### Unit tests
207+
208+
- `k8s.io/kubernetes/pkg/apis/core/validation/validation.go`: `2026-02-28` - `85.3`
209+
- `k8s.io/kubernetes/pkg/volume/configmap`: `2026-02-28` - `76.4`
210+
- `k8s.io/kubernetes/pkg/volume/downwardapi`: `2026-02-28` - `51.1`
211+
- `k8s.io/kubernetes/pkg/volume/projected`: `2026-02-28` - `70`
212+
- `k8s.io/kubernetes/pkg/volume/secret`: `2026-02-28` - `67.3`
213+
- `k8s.io/kubernetes/pkg/volume/util/atomic_writer`: `2026-02-28` - `72.6`
214+
215+
##### Integration tests
216+
217+
##### e2e tests
218+
219+
Extend the existing volume end-to-end tests.
220+
221+
Create an `agnhost` test pod with the volume definition under test.
222+
Use the `mounttest` binary to verify the ownership of the files.
223+
224+
### Graduation Criteria
225+
226+
#### Alpha
227+
228+
- Feature implemented behind a feature gate
229+
- Initial e2e tests completed and enabled
230+
231+
### Upgrade / Downgrade Strategy
232+
233+
### Version Skew Strategy
234+
235+
## Production Readiness Review Questionnaire
236+
237+
### Feature Enablement and Rollback
238+
239+
###### How can this feature be enabled / disabled in a live cluster?
240+
241+
- [x] Feature gate (also fill in values in `kep.yaml`)
242+
  - Feature gate name: AtomicWriteVolumeUserFields
243+
  - Components depending on the feature gate:
244+
    - kube-apiserver
245+
    - kubelet
246+
247+
###### Does enabling the feature change any default behavior?
248+
249+
No.
250+
251+
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
252+
253+
Yes.
254+
255+
Existing volume files with the user fields in existing pods are not affected, since the files have been created and
256+
configured with a file owner already.
257+
258+
Only newly created volume files will be affected.
259+
260+
###### What happens if we reenable the feature if it was previously rolled back?
261+
262+
Existing volume files with the user fields in existing pods are not affected, since the files have been created and
263+
configured with a file owner already.
264+
265+
Only newly created volume files will be affected.
266+
267+
###### Are there any tests for feature enablement/disablement?
268+
269+
### Rollout, Upgrade and Rollback Planning
270+
271+
###### How can a rollout or rollback fail? Can it impact already running workloads?
272+
273+
###### What specific metrics should inform a rollback?
274+
275+
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
276+
277+
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
278+
279+
No.
280+
281+
### Monitoring Requirements
282+
283+
###### How can an operator determine if the feature is in use by workloads?
284+
285+
###### How can someone using this feature know that it is working for their instance?
286+
287+
- [ ] Events
288+
  - Event Reason:
289+
- [ ] API .status
290+
  - Condition name:
291+
  - Other field:
292+
- [ ] Other (treat as last resort)
293+
  - Details:
294+
295+
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
296+
297+
No changes to kubelet SLOs.
298+
299+
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
300+
301+
- [x] Metrics
302+
  - Metric name: `storage_operation_duration_seconds` (existing metric)
303+
  - Aggregation method: filter by `volume_plugin` = one of
304+
    `kubernetes.io/configmap`, `kubernetes.io/downward-api`, `kubernetes.io/projected` or `kubernetes.io/secret`
305+
  - Components exposing the metric: kubelet
306+
307+
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
308+
309+
No.
310+
311+
### Dependencies
312+
313+
###### Does this feature depend on any specific services running in the cluster?
314+
315+
No.
316+
317+
### Scalability
318+
319+
###### Will enabling / using this feature result in any new API calls?
320+
321+
No.
322+
323+
###### Will enabling / using this feature result in introducing new API types?
324+
325+
No.
326+
327+
###### Will enabling / using this feature result in any new calls to the cloud provider?
328+
329+
No.
330+
331+
###### Will enabling / using this feature result in increasing size or count of the existing API objects?
332+
333+
Yes.
334+
335+
The new optional `DefaultUser` and `User` fields of atomic write volume API objects each hold a 64-bit integer.
336+
337+
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
338+
339+
No.
340+
341+
###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
342+
343+
No.
344+
345+
###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
346+
347+
No.
348+
349+
### Troubleshooting
350+
351+
###### How does this feature react if the API server and/or etcd is unavailable?
352+
353+
Not applicable. Volume files of pods are not affected if the API server and/or etcd is unavailable.
354+
355+
###### What are other known failure modes?
356+
357+
Not applicable.
358+
359+
###### What steps should be taken if SLOs are not being met to determine the problem?
360+
361+
Not applicable.
362+
363+
## Implementation History
364+
365+
## Drawbacks
366+
367+
Additional complexity in the atomic write volume modules.
368+
369+
However, the added complexity is minimal, since the feature is able to reuse the internal `FileProjection.FsUser`
370+
mechanism introduced by [KEP-1205][kep-1205-file-permission].
371+
372+
[kep-1205-file-permission]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-auth/1205-bound-service-account-tokens#file-permission
373+
374+
## Alternatives
375+
376+
A `podSecurityContext.fsUser` field has been proposed as an alternative. However, I believe this KEP is preferable
377+
for the following reasons.
378+
379+
1. `podSecurityContext.fsUser` is a pod level construct. It doesn't support running multiple containers
380+
as different users that require different file owners per volume or file.
381+
382+
2. `podSecurityContext.fsUser` is a pod level construct and implies supporting all the volume types.
383+
This may also imply passing `fsUser` to CSI drivers, since Kubernetes is [currently passing][csi-fsgroup]
384+
`fsGroup` to CSI drivers.
385+
386+
3. Its interaction with `fsGroupChangePolicy` is problematic. For example, users may reasonably expect
387+
`fsUser` to follow the same behavior as `fsGroupChangePolicy`. Adding `fsUser` may also entail a new
388+
`fsUserChangePolicy` feature.
389+
390+
[csi-fsgroup]: https://kubernetes.io/blog/2022/12/23/kubernetes-12-06-fsgroup-on-mount/
391+
392+
## Infrastructure Needed (Optional)
393+
394+
No.
