Compare commits

...

46 Commits

Author SHA1 Message Date
Patrick O'Doherty
db87ec8efe client/web: fix CSRF handler order in web UI (#15143)
Fix the order of the CSRF handlers (HTTP plaintext context setting,
_then_ enforcement) in the construction of the web UI server. This
resolves false-positive "invalid Origin" 403 exceptions when attempting
to update settings in the web UI.

Add unit test to exercise the CSRF protection failure and success cases
for our web UI configuration.

Updates #14822
Updates #14872

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
2025-02-27 22:00:59 +00:00
Andrew Lytvynov
0ae928e050 appc: fix a deadlock in route advertisements (#15031) (#15088)
`routeAdvertiser` is the `iplocal.LocalBackend`. Calls to
`Advertise/UnadvertiseRoute` end up calling `EditPrefs` which in turn
calls `authReconfig` which finally calls `readvertiseAppConnectorRoutes`
which calls `AppConnector.DomainRoutes` and gets stuck on a mutex that
was already held when `routeAdvertiser` was called.

Make all calls to `routeAdvertiser` in `app.AppConnector` go through the
execqueue instead as a short-term fix.

Updates tailscale/corp#25965

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Co-authored-by: Irbe Krumina <irbe@tailscale.com>
2025-02-21 11:56:41 -08:00
Andrea Gottardo
c7a79d7bae VERSION.txt: this is v1.80.2 (#15003)
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2025-02-12 13:31:52 -05:00
Brad Fitzpatrick
fefb04b56b Revert "control/controlclient: delete unreferenced mapSession UserProfiles"
This reverts commit 413fb5b933.

See long story in #14992

Updates #14992
Updates tailscale/corp#26058

Change-Id: I3de7d080443efe47cbf281ea20887a3caf202488
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-02-11 15:00:10 -08:00
Brad Fitzpatrick
ce31002559 go.mod: update x/net for macOS/iOS ParseRIB fix
Updates #14201

Change-Id: Iea13c14e16dea1b349a7a8d999d7beef4d2f4a1c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-02-11 09:33:20 -08:00
Percy Wegmann
aff27451b9 ssh/tailssh: add back a fake public key handler to support buggy clients
Some clients don't support 'none' authentication and insist on supplying
a public key. This change allows them to do so. It ignores the public key
and uses Tailscale to authenticate.

Updates #14922

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2025-02-10 15:58:55 -06:00
Andrea Gottardo
b9dc617dda VERSION.txt: this is v1.80.1 (#14932)
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2025-02-06 13:38:55 -05:00
James Tucker
dad4c87ca0 net/netmon: add extra panic guard around ParseRIB
We once again have a report of a panic from ParseRIB. This panic guard
should probably remain permanent.

Updates #14201

This reverts commit de9d4b2f88.

Signed-off-by: James Tucker <james@tailscale.com>
(cherry picked from commit 80a100b3cb)
2025-02-06 08:34:29 -08:00
Andrea Gottardo
649a71f8ac VERSION.txt: this is v1.80.0 (#14837)
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2025-01-30 12:52:55 -08:00
Tom Proctor
138a83efe1 cmd/containerboot: wait for consistent state on shutdown (#14263)
tailscaled's ipn package writes a collection of keys to state after
authenticating to control, but one at a time. If containerboot happens
to send a SIGTERM signal to tailscaled in the middle of writing those
keys, it may shut down with an inconsistent state Secret and never
recover. While we can't durably fix this with our current single-use
auth keys (no atomic operation to auth + write state), we can reduce
the window for this race condition by checking for partial state
before sending SIGTERM to tailscaled. Best effort only.

Updates #14080

Change-Id: I0532d51b6f0b7d391e538468bd6a0a80dbe1d9f7
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-01-30 13:51:10 +00:00
Anton Tolchanov
c2af1cd9e3 prober: support multiple probes running concurrently
Some probes might need to run for longer than their scheduling interval,
so this change relaxes the 1-at-a-time restriction, allowing us to
configure probe concurrency and timeout separately. The default values
remain the same (concurrency of 1; timeout of 80% of interval).

Updates tailscale/corp#25479

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2025-01-30 12:22:23 +00:00
Irbe Krumina
a49af98b31 cmd/k8s-operator: temporarily disable HA Ingress controller (#14833)
The HA Ingress functionality is not actually doing anything
valuable yet, so don't run the controller in 1.80 release yet.

Updates tailscale/tailscale#24795

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-30 11:36:33 +00:00
Brad Fitzpatrick
0ed4aa028f control/controlclient: flesh out a recently added comment
Updates tailscale/corp#26058

Change-Id: Ib46161fbb2e79c080f886083665961f02cbf5949
2025-01-30 08:48:52 +00:00
Brad Fitzpatrick
ed8bb3b564 control/controlclient: add missing word in comment
Found by review.ai.

Updates #cleanup

Change-Id: Ib9126de7327527b8b3818d92cc774bb1c7b6f974
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-01-30 08:48:52 +00:00
Irbe Krumina
3f39211f98 cmd/k8s-operator: check that cluster traffic is routed to egress ProxyGroup Pod before marking it as ready (#14792)
This change builds on top of #14436 to ensure minimum downtime during egress ProxyGroup update rollouts:

- adds a readiness gate for ProxyGroup replicas that prevents kubelet from marking
the replica Pod as ready before a corresponding readiness condition has been added
to the Pod

- adds a reconciler that reconciles egress ProxyGroup Pods and, for each that is not ready,
if cluster traffic for relevant egress endpoints is routed via this Pod- if so add the
readiness condition to allow kubelet to mark the Pod as ready.

During the sequenced StatefulSet update rollouts kubelet does not restart
a Pod before the previous replica has been updated and marked as ready, so
ensuring that a replica is not marked as ready allows to avoid a temporary
post-update situation where all replicas have been restarted, but none of the
new ones are yet set up as an endpoint for the egress service, so cluster traffic is dropped.

Updates tailscale/tailscale#14326

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-30 08:47:45 +00:00
Brad Fitzpatrick
8bd04bdd3a go.mod: bump gorilla/csrf for security fix (#14822)
For 9dd6af1f6d

Update client/web and safeweb to correctly signal to the csrf middleware
whether the request is being served over TLS. This determines whether
Origin and Referer header checks are strictly enforced. The gorilla
library previously did not enforce these checks due to a logic bug based
on erroneous use of the net/http.Request API. The patch to fix this also
inverts the library behavior to presume that every request is being
served over TLS, necessitating these changes.

Updates tailscale/corp#25340

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Co-authored-by: Patrick O'Doherty <patrick@tailscale.com>
2025-01-29 12:44:01 -08:00
Percy Wegmann
b60f6b849a Revert "ssh,tempfork/gliderlabs/ssh: replace github.com/tailscale/golang-x-crypto/ssh with golang.org/x/crypto/ssh"
This reverts commit 46fd4e58a2.

We don't want to include this in 1.80 yet, but can add it back post 1.80.

Updates #8593

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2025-01-29 10:47:45 -06:00
Irbe Krumina
52f88f782a cmd/k8s-operator: don't set deprecated configfile hash on new proxies (#14817)
Fixes the configfile reload logic- if the tailscale capver can not
yet be determined because the device info is not yet written to the
state Secret, don't assume that the proxy is pre-110.

Updates tailscale/tailscale#13032

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-29 15:48:05 +00:00
Irbe Krumina
b406f209c3 cmd/{k8s-operator,containerboot},kube: ensure egress ProxyGroup proxies don't terminate while cluster traffic is still routed to them (#14436)
cmd/{containerboot,k8s-operator},kube: add preshutdown hook for egress PG proxies

This change is part of work towards minimizing downtime during update
rollouts of egress ProxyGroup replicas.
This change:
- updates the containerboot health check logic to return Pod IP in headers,
if set
- always runs the health check for egress PG proxies
- updates ClusterIP Services created for PG egress endpoints to include
the health check endpoint
- implements preshutdown endpoint in proxies. The preshutdown endpoint
logic waits till, for all currently configured egress services, the ClusterIP
Service health check endpoint is no longer returned by the shutting-down Pod
(by looking at the new Pod IP header).
- ensures that kubelet is configured to call the preshutdown endpoint

This reduces the possibility that, as replicas are terminated during an update,
a replica gets terminated to which cluster traffic is still being routed via
the ClusterIP Service because kube proxy has not yet updated routig rules.
This is not a perfect check as in practice, it only checks that the kube
proxy on the node on which the proxy runs has updated rules. However, overall
this might be good enough.

The preshutdown logic is disabled if users have configured a custom health check
port via TS_LOCAL_ADDR_PORT env var. This change throws a warnign if so and in
future setting of that env var for operator proxies might be disallowed (as users
shouldn't need to configure this for a Pod directly).
This is backwards compatible with earlier proxy versions.

Updates tailscale/tailscale#14326


Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-29 07:35:50 +00:00
Andrew Dunham
eb299302ba types/views: fix SliceEqualAnyOrderFunc short optimization
This was flagged by @tkhattra on the merge commit; thanks!

Updates tailscale/corp#25479

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ia8045640f02bd4dcc0fe7433249fd72ac6b9cf52
2025-01-28 23:17:43 -05:00
dependabot[bot]
0aa54151f2 .github: Bump actions/checkout from 3.6.0 to 4.2.2 (#14139)
Bumps [actions/checkout](https://github.com/actions/checkout) from 3.6.0 to 4.2.2.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v3.6.0...11bd71901bbe5b1630ceea73d27597364c9af683)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-28 15:03:13 -07:00
Mario Minardi
f1514a944a go.toolchain.rev: bump from Go 1.23.3 to 1.23.5 (#14814)
Update Go toolchain to 1.23.5.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2025-01-28 14:35:24 -07:00
Percy Wegmann
46fd4e58a2 ssh,tempfork/gliderlabs/ssh: replace github.com/tailscale/golang-x-crypto/ssh with golang.org/x/crypto/ssh
The upstream crypto package now supports sending banners at any time during
authentication, so the Tailscale fork of crypto/ssh is no longer necessary.

github.com/tailscale/golang-x-crypto is still needed for some custom ACME
autocert functionality.

tempfork/gliderlabs is still necessary because of a few other customizations,
mostly related to TTY handling.

Updates #8593

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2025-01-28 14:20:55 -06:00
Anton Tolchanov
3abfbf50ae tsnet: return from Accept when the listener gets closed
Fixes #14808

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2025-01-28 14:02:36 +00:00
yejingchen
6f10fe8ab1 cmd/tailscale: add warning to help text of --force-reauth (#14778)
The warning text is adapted from https://tailscale.com/kb/1028/key-expiry#renewing-keys-for-an-expired-device .

There is already https://github.com/tailscale/tailscale/pull/7575 which presents a warning when connected over Tailscale, however the detection is done by checking SSH environment variables, which are absent within systemd's run0*. That means `--force-reauth` will happily bring down Tailscale connection, leaving the user in despair.

Changing only the help text is by no means a complete solution, but hopefully it will stop users from blindly trying it out, and motivate them to search for a proper solution.

*: https://www.freedesktop.org/software/systemd/man/devel/run0.html

Updates #3849

Signed-off-by: yejingchen <ye.jingchen@gmail.com>
2025-01-28 10:05:49 +00:00
Brad Fitzpatrick
079973de82 tempfork/acme: fix TestSyncedToUpstream with Windows line endings
Updates #10238

Change-Id: Ic85811c267679a9f79377f376d77dee3a9d92ce7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-01-27 22:15:08 +00:00
Brad Fitzpatrick
ba1f9a3918 types/persist: remove Persist.LegacyFrontendPrivateMachineKey
It was a temporary migration over four years ago. It's no longer
relevant.

Updates #610

Change-Id: I1f00c9485fab13ede6f77603f7d4235222c2a481
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-01-27 22:01:50 +00:00
Brad Fitzpatrick
2691b9f6be tempfork/acme: add new package for x/crypto package acme fork, move
We've been maintaining temporary dev forks of golang.org/x/crypto/{acme,ssh}
in https://github.com/tailscale/golang-x-crypto instead of using
this repo's tempfork directory as we do with other packages. The reason we were
doing that was because x/crypto/ssh depended on x/crypto/ssh/internal/poly1305
and I hadn't noticed there are forwarding wrappers already available
in x/crypto/poly1305. It also depended internal/bcrypt_pbkdf but we don't use that
so it's easy to just delete that calling code in our tempfork/ssh.

Now that our SSH changes have been upstreamed, we can soon unfork from SSH.

That leaves ACME remaining.

This change copies our tailscale/golang-x-crypto/acme code to
tempfork/acme but adds a test that our vendored copied still matches
our tailscale/golang-x-crypto repo, where we can continue to do
development work and rebases with upstream. A comment on the new test
describes the expected workflow.

While we could continue to just import & use
tailscale/golang-x-crypto/acme, it seems a bit nicer to not have that
entire-fork-of-x-crypto visible at all in our transitive deps and the
questions that invites. Showing just a fork of an ACME client is much
less scary. It does add a step to the process of hacking on the ACME
client code, but we do that approximately never anyway, and the extra
step is very incremental compared to the existing tedious steps.

Updates #8593
Updates #10238

Change-Id: I8af4378c04c1f82e63d31bf4d16dba9f510f9199
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-01-27 21:32:26 +00:00
Brad Fitzpatrick
bd9725c5f8 health: relax no-derp-home warnable to not fire if not in map poll
Fixes #14687

Change-Id: I05035df7e075e94dd39b2192bee34d878c15310d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-01-27 20:39:37 +00:00
Brad Fitzpatrick
bfde8079a0 health: do Warnable dependency filtering in tailscaled
Previously we were depending on the GUI(s) to do it.
By doing it in tailscaled, GUIs can be simplified and be
guaranteed to render consistent results.

If warnable A depends on warnable B, if both A & B are unhealhy, only
B will be shown to the GUI as unhealthy. Once B clears up, only then
will A be presented as unhealthy.

Updates #14687

Change-Id: Id8566f2672d8d2d699740fa053d4e2a2c8009e83
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-01-27 20:39:29 +00:00
dependabot[bot]
76dc028b38 .github: Bump github/codeql-action from 3.28.1 to 3.28.5 (#14794)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.1 to 3.28.5.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](b6a472f63d...f6091c0113)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-27 12:36:42 -07:00
dependabot[bot]
3fec806523 .github: Bump actions/setup-go from 5.2.0 to 5.3.0 (#14793)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 5.2.0 to 5.3.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](3041bf56c9...f111f3307d)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-27 12:36:04 -07:00
Brad Fitzpatrick
bce05ec6c3 control/controlclient,tempfork/httprec: don't link httptest, test certs for c2n
The c2n handling code was using the Go httptest package's
ResponseRecorder code but that's in a test package which brings in
Go's test certs, etc.

This forks the httptest recorder type into its own package that only
has the recorder and adds a test that we don't re-introduce a
dependency on httptest.

Updates #12614

Change-Id: I3546f49972981e21813ece9064cc2be0b74f4b16
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-01-26 21:30:28 +00:00
Brad Fitzpatrick
8c925899e1 go.mod: bump depaware, add --internal flag to stop hiding internal packages
The hiding of internal packages has hidden things I wanted to see a
few times now. Stop hiding them. This makes depaware.txt output a bit
longer, but not too much. Plus we only really look at it with diffs &
greps anyway; it's not like anybody reads the whole thing.

Updates #12614

Change-Id: I868c89eeeddcaaab63e82371651003629bc9bda8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-01-26 21:12:34 +00:00
Brad Fitzpatrick
04029b857f tstest/deptest: verify that tailscale.com BadDeps actually exist
This protects against rearranging packages and not catching that a BadDeps
package got moved. That would then effectively remove a test.

Updates #12614

Change-Id: I257f1eeda9e3569c867b7628d5bfb252d3354ba6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-01-26 18:50:25 +00:00
Brad Fitzpatrick
e701fde6b3 control/controlknobs: make Knobs.AsDebugJSON automatic, not require maintenance
The AsDebugJSON method (used only for a LocalAPI debug call) always
needed to be updated whenever a new controlknob was added. We had a
test for it, which was nice, but it was a tedious step we don't need
to do. Use reflect instead.

Updates #14788

Change-Id: If59cd776920f3ce7c748f86ed2eddd9323039a0b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-01-26 18:49:11 +00:00
Derek Kaser
66b2e9fd07 envknob/featureknob: allow use of exit node on unraid (#14754)
Fixes #14372

Signed-off-by: Derek Kaser <11674153+dkaser@users.noreply.github.com>
2025-01-26 15:35:58 +00:00
Brad Fitzpatrick
68a66ee81b feature/capture: move packet capture to feature/*, out of iOS + CLI
We had the debug packet capture code + Lua dissector in the CLI + the
iOS app. Now we don't, with tests to lock it in.

As a bonus, tailscale.com/net/packet and tailscale.com/net/flowtrack
no longer appear in the CLI's binary either.

A new build tag ts_omit_capture disables the packet capture code and
was added to build_dist.sh's --extra-small mode.

Updates #12614

Change-Id: I79b0628c0d59911bd4d510c732284d97b0160f10
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-01-24 17:52:43 -08:00
Brad Fitzpatrick
2c98c44d9a control/controlclient: sanitize invalid DERPMap nil Region from control
Fixes #14752

Change-Id: If364603eefb9ac6dc5ec6df84a0d5e16c94dda8d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-01-24 17:19:12 -08:00
James Tucker
82e41ddc42 cmd/natc: expose netstack metrics in client metrics in natc
Updates tailscale/corp#25169

Signed-off-by: James Tucker <james@tailscale.com>
2025-01-24 16:39:09 -08:00
Tom Proctor
2089f4b603 ipn/ipnlocal: add debug envknob for ACME directory URL (#14771)
Adds an envknob setting for changing the client's ACME directory URL.
This allows testing cert issuing against LE's staging environment, as
well as enabling local-only test environments, which is useful for
avoiding the production rate limits in test and development scenarios.

Fixes #14761

Change-Id: I191c840c0ca143a20e4fa54ea3b2f9b7cbfc889f
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-01-25 00:29:00 +00:00
James Tucker
ca39c4e150 cmd/natc,wgengine/netstack: tune buffer size and segment lifetime in natc
Some natc instances have been observed with excessive memory growth,
dominant in gvisor buffers. It is likely that the connection buffers are
sticking around for too long due to the default long segment time, and
uptuned buffer size applied by default in wgengine/netstack. Apply
configurations in natc specifically which are a better match for the
natc use case, most notably a 5s maximum segment lifetime.

Updates tailscale/corp#25169

Signed-off-by: James Tucker <james@tailscale.com>
2025-01-24 16:19:55 -08:00
Brad Fitzpatrick
1a7274fccb control/controlclient: skip SetControlClientStatus when queue has newer results later
Updates #1909
Updates #12542
Updates tailscale/corp#26058

Change-Id: I3033d235ca49f9739fdf3deaf603eea4ec3e407e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-01-24 16:16:22 -08:00
Mario Minardi
cbf1a9abe1 go.{mod,sum}: update web-client-prebuilt (#14772)
Manually update the `web-client-prebuilt` package as the GitHub action
is failing for some reason.

Updates https://github.com/tailscale/tailscale/issues/14568

Signed-off-by: Mario Minardi <mario@tailscale.com>
2025-01-24 17:04:12 -07:00
Mario Minardi
716e4fcc97 client/web: remove advanced options from web client login (#14770)
Removing the advanced options collapsible from the web client login for
now ahead of our next client release.

Updates https://github.com/tailscale/tailscale/issues/14568

Signed-off-by: Mario Minardi <mario@tailscale.com>
2025-01-24 16:29:58 -07:00
Tom Proctor
69bc164c62 ipn/ipnlocal: include DNS SAN in cert CSR (#14764)
The CN field is technically deprecated; set the requested name in a DNS SAN
extension in addition to maximise compatibility with RFC 8555.

Fixes #14762

Change-Id: If5d27f1e7abc519ec86489bf034ac98b2e613043

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-01-24 17:04:26 +00:00
127 changed files with 9835 additions and 1204 deletions

View File

@@ -18,7 +18,7 @@ jobs:
runs-on: [ ubuntu-latest ]
steps:
- name: Check out code
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: Build checklocks
run: ./tool/go build -o /tmp/checklocks gvisor.dev/gvisor/tools/checklocks/cmd/checklocks

View File

@@ -45,17 +45,17 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
# Install a more recent Go that understands modern go.mod content.
- name: Install Go
uses: actions/setup-go@3041bf56c941b39c61721a86cd11f3bb1338122a # v5.2.0
uses: actions/setup-go@f111f3307d8850f501ac008e886eec1fd1932a34 # v5.3.0
with:
go-version-file: go.mod
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@b6a472f63d85b9c78a3ac5e89422239fc15e9b3c # v3.28.1
uses: github/codeql-action/init@f6091c0113d1dcf9b98e269ee48e8a7e51b7bdd4 # v3.28.5
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
@@ -66,7 +66,7 @@ jobs:
# Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@b6a472f63d85b9c78a3ac5e89422239fc15e9b3c # v3.28.1
uses: github/codeql-action/autobuild@f6091c0113d1dcf9b98e269ee48e8a7e51b7bdd4 # v3.28.5
# Command-line programs to run using the OS shell.
# 📚 https://git.io/JvXDl
@@ -80,4 +80,4 @@ jobs:
# make release
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@b6a472f63d85b9c78a3ac5e89422239fc15e9b3c # v3.28.1
uses: github/codeql-action/analyze@f6091c0113d1dcf9b98e269ee48e8a7e51b7bdd4 # v3.28.5

View File

@@ -10,6 +10,6 @@ jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: "Build Docker image"
run: docker build .

View File

@@ -17,7 +17,7 @@ jobs:
id-token: "write"
contents: "read"
steps:
- uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
with:
ref: "${{ (inputs.tag != null) && format('refs/tags/{0}', inputs.tag) || '' }}"
- uses: "DeterminateSystems/nix-installer-action@main"

View File

@@ -23,9 +23,9 @@ jobs:
name: lint
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- uses: actions/setup-go@3041bf56c941b39c61721a86cd11f3bb1338122a # v5.2.0
- uses: actions/setup-go@f111f3307d8850f501ac008e886eec1fd1932a34 # v5.3.0
with:
go-version-file: go.mod
cache: false

View File

@@ -14,7 +14,7 @@ jobs:
steps:
- name: Check out code into the Go module directory
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: Install govulncheck
run: ./tool/go install golang.org/x/vuln/cmd/govulncheck@latest

View File

@@ -36,7 +36,6 @@ jobs:
- "ubuntu:24.04"
- "elementary/docker:stable"
- "elementary/docker:unstable"
- "parrotsec/core:lts-amd64"
- "parrotsec/core:latest"
- "kalilinux/kali-rolling"
- "kalilinux/kali-dev"
@@ -92,10 +91,7 @@ jobs:
|| contains(matrix.image, 'parrotsec')
|| contains(matrix.image, 'kalilinux')
- name: checkout
# We cannot use v4, as it requires a newer glibc version than some of the
# tested images provide. See
# https://github.com/actions/checkout/issues/1487
uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: run installer
run: scripts/installer.sh
# Package installation can fail in docker because systemd is not running

View File

@@ -17,7 +17,7 @@ jobs:
runs-on: [ ubuntu-latest ]
steps:
- name: Check out code
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: Build and lint Helm chart
run: |
eval `./tool/go run ./cmd/mkversion`

View File

@@ -17,7 +17,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Check out code
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: Run SSH integration tests
run: |
make sshintegrationtest

View File

@@ -50,7 +50,7 @@ jobs:
- shard: '4/4'
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: build test wrapper
run: ./tool/go build -o /tmp/testwrapper ./cmd/testwrapper
- name: integration tests as root
@@ -78,7 +78,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: Restore Cache
uses: actions/cache@1bd1e32a3bdc45362d1e726936510720a7c30a57 # v4.2.0
with:
@@ -150,10 +150,10 @@ jobs:
runs-on: windows-2022
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: Install Go
uses: actions/setup-go@3041bf56c941b39c61721a86cd11f3bb1338122a # v5.2.0
uses: actions/setup-go@f111f3307d8850f501ac008e886eec1fd1932a34 # v5.3.0
with:
go-version-file: go.mod
cache: false
@@ -190,7 +190,7 @@ jobs:
options: --privileged
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: chown
run: chown -R $(id -u):$(id -g) $PWD
- name: privileged tests
@@ -202,7 +202,7 @@ jobs:
if: github.repository == 'tailscale/tailscale'
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: Run VM tests
run: ./tool/go test ./tstest/integration/vms -v -no-s3 -run-vm-tests -run=TestRunUbuntu2004
env:
@@ -214,7 +214,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: build all
run: ./tool/go install -race ./cmd/...
- name: build tests
@@ -258,7 +258,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: Restore Cache
uses: actions/cache@1bd1e32a3bdc45362d1e726936510720a7c30a57 # v4.2.0
with:
@@ -295,7 +295,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: build some
run: ./tool/go build ./ipn/... ./wgengine/ ./types/... ./control/controlclient
env:
@@ -323,7 +323,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: Restore Cache
uses: actions/cache@1bd1e32a3bdc45362d1e726936510720a7c30a57 # v4.2.0
with:
@@ -356,7 +356,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
# Super minimal Android build that doesn't even use CGO and doesn't build everything that's needed
# and is only arm64. But it's a smoke build: it's not meant to catch everything. But it'll catch
# some Android breakages early.
@@ -371,7 +371,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: Restore Cache
uses: actions/cache@1bd1e32a3bdc45362d1e726936510720a7c30a57 # v4.2.0
with:
@@ -405,7 +405,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: test tailscale_go
run: ./tool/go test -tags=tailscale_go,ts_enable_sockstats ./net/sockstats/...
@@ -477,17 +477,17 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: check depaware
run: |
export PATH=$(./tool/go env GOROOT)/bin:$PATH
find . -name 'depaware.txt' | xargs -n1 dirname | xargs ./tool/go run github.com/tailscale/depaware --check
find . -name 'depaware.txt' | xargs -n1 dirname | xargs ./tool/go run github.com/tailscale/depaware --check --internal
go_generate:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: check that 'go generate' is clean
run: |
pkgs=$(./tool/go list ./... | grep -Ev 'dnsfallback|k8s-operator|xdp')
@@ -500,7 +500,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: check that 'go mod tidy' is clean
run: |
./tool/go mod tidy
@@ -512,7 +512,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: check licenses
run: ./scripts/check_license_headers.sh .
@@ -528,7 +528,7 @@ jobs:
goarch: "386"
steps:
- name: checkout
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: install staticcheck
run: GOBIN=~/.local/bin ./tool/go install honnef.co/go/tools/cmd/staticcheck
- name: run staticcheck

View File

@@ -21,7 +21,7 @@ jobs:
steps:
- name: Check out code
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: Run update-flakes
run: ./update-flake.sh

View File

@@ -14,7 +14,7 @@ jobs:
steps:
- name: Check out code
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: Run go get
run: |

View File

@@ -24,7 +24,7 @@ jobs:
steps:
- name: Check out code
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- name: Install deps
run: ./tool/yarn --cwd client/web
- name: Run lint

View File

@@ -17,7 +17,7 @@ lint: ## Run golangci-lint
updatedeps: ## Update depaware deps
# depaware (via x/tools/go/packages) shells back to "go", so make sure the "go"
# it finds in its $$PATH is the right one.
PATH="$$(./tool/go env GOROOT)/bin:$$PATH" ./tool/go run github.com/tailscale/depaware --update \
PATH="$$(./tool/go env GOROOT)/bin:$$PATH" ./tool/go run github.com/tailscale/depaware --update --internal \
tailscale.com/cmd/tailscaled \
tailscale.com/cmd/tailscale \
tailscale.com/cmd/derper \
@@ -27,7 +27,7 @@ updatedeps: ## Update depaware deps
depaware: ## Run depaware checks
# depaware (via x/tools/go/packages) shells back to "go", so make sure the "go"
# it finds in its $$PATH is the right one.
PATH="$$(./tool/go env GOROOT)/bin:$$PATH" ./tool/go run github.com/tailscale/depaware --check \
PATH="$$(./tool/go env GOROOT)/bin:$$PATH" ./tool/go run github.com/tailscale/depaware --check --internal \
tailscale.com/cmd/tailscaled \
tailscale.com/cmd/tailscale \
tailscale.com/cmd/derper \

View File

@@ -1 +1 @@
1.79.0
1.80.2

View File

@@ -289,9 +289,11 @@ func (e *AppConnector) updateDomains(domains []string) {
toRemove = append(toRemove, netip.PrefixFrom(a, a.BitLen()))
}
}
if err := e.routeAdvertiser.UnadvertiseRoute(toRemove...); err != nil {
e.logf("failed to unadvertise routes on domain removal: %v: %v: %v", slicesx.MapKeys(oldDomains), toRemove, err)
}
e.queue.Add(func() {
if err := e.routeAdvertiser.UnadvertiseRoute(toRemove...); err != nil {
e.logf("failed to unadvertise routes on domain removal: %v: %v: %v", slicesx.MapKeys(oldDomains), toRemove, err)
}
})
}
e.logf("handling domains: %v and wildcards: %v", slicesx.MapKeys(e.domains), e.wildcards)
@@ -310,11 +312,6 @@ func (e *AppConnector) updateRoutes(routes []netip.Prefix) {
return
}
if err := e.routeAdvertiser.AdvertiseRoute(routes...); err != nil {
e.logf("failed to advertise routes: %v: %v", routes, err)
return
}
var toRemove []netip.Prefix
// If we're storing routes and know e.controlRoutes is a good
@@ -338,9 +335,14 @@ nextRoute:
}
}
if err := e.routeAdvertiser.UnadvertiseRoute(toRemove...); err != nil {
e.logf("failed to unadvertise routes: %v: %v", toRemove, err)
}
e.queue.Add(func() {
if err := e.routeAdvertiser.AdvertiseRoute(routes...); err != nil {
e.logf("failed to advertise routes: %v: %v", routes, err)
}
if err := e.routeAdvertiser.UnadvertiseRoute(toRemove...); err != nil {
e.logf("failed to unadvertise routes: %v: %v", toRemove, err)
}
})
e.controlRoutes = routes
if err := e.storeRoutesLocked(); err != nil {

View File

@@ -8,6 +8,7 @@ import (
"net/netip"
"reflect"
"slices"
"sync/atomic"
"testing"
"time"
@@ -86,6 +87,7 @@ func TestUpdateRoutes(t *testing.T) {
routes := []netip.Prefix{netip.MustParsePrefix("192.0.2.0/24"), netip.MustParsePrefix("192.0.0.1/32")}
a.updateRoutes(routes)
a.Wait(ctx)
slices.SortFunc(rc.Routes(), prefixCompare)
rc.SetRoutes(slices.Compact(rc.Routes()))
@@ -105,6 +107,7 @@ func TestUpdateRoutes(t *testing.T) {
}
func TestUpdateRoutesUnadvertisesContainedRoutes(t *testing.T) {
ctx := context.Background()
for _, shouldStore := range []bool{false, true} {
rc := &appctest.RouteCollector{}
var a *AppConnector
@@ -117,6 +120,7 @@ func TestUpdateRoutesUnadvertisesContainedRoutes(t *testing.T) {
rc.SetRoutes([]netip.Prefix{netip.MustParsePrefix("192.0.2.1/32")})
routes := []netip.Prefix{netip.MustParsePrefix("192.0.2.0/24")}
a.updateRoutes(routes)
a.Wait(ctx)
if !slices.EqualFunc(routes, rc.Routes(), prefixEqual) {
t.Fatalf("got %v, want %v", rc.Routes(), routes)
@@ -636,3 +640,57 @@ func TestMetricBucketsAreSorted(t *testing.T) {
t.Errorf("metricStoreRoutesNBuckets must be in order")
}
}
// TestUpdateRoutesDeadlock is a regression test for a deadlock in
// LocalBackend<->AppConnector interaction. When using real LocalBackend as the
// routeAdvertiser, calls to Advertise/UnadvertiseRoutes can end up calling
// back into AppConnector via authReconfig. If everything is called
// synchronously, this results in a deadlock on AppConnector.mu.
func TestUpdateRoutesDeadlock(t *testing.T) {
ctx := context.Background()
rc := &appctest.RouteCollector{}
a := NewAppConnector(t.Logf, rc, &RouteInfo{}, fakeStoreRoutes)
advertiseCalled := new(atomic.Bool)
unadvertiseCalled := new(atomic.Bool)
rc.AdvertiseCallback = func() {
// Call something that requires a.mu to be held.
a.DomainRoutes()
advertiseCalled.Store(true)
}
rc.UnadvertiseCallback = func() {
// Call something that requires a.mu to be held.
a.DomainRoutes()
unadvertiseCalled.Store(true)
}
a.updateDomains([]string{"example.com"})
a.Wait(ctx)
// Trigger rc.AdveriseRoute.
a.updateRoutes(
[]netip.Prefix{
netip.MustParsePrefix("127.0.0.1/32"),
netip.MustParsePrefix("127.0.0.2/32"),
},
)
a.Wait(ctx)
// Trigger rc.UnadveriseRoute.
a.updateRoutes(
[]netip.Prefix{
netip.MustParsePrefix("127.0.0.1/32"),
},
)
a.Wait(ctx)
if !advertiseCalled.Load() {
t.Error("AdvertiseRoute was not called")
}
if !unadvertiseCalled.Load() {
t.Error("UnadvertiseRoute was not called")
}
if want := []netip.Prefix{netip.MustParsePrefix("127.0.0.1/32")}; !slices.Equal(slices.Compact(rc.Routes()), want) {
t.Fatalf("got %v, want %v", rc.Routes(), want)
}
}

View File

@@ -11,12 +11,22 @@ import (
// RouteCollector is a test helper that collects the list of routes advertised
type RouteCollector struct {
// AdvertiseCallback (optional) is called synchronously from
// AdvertiseRoute.
AdvertiseCallback func()
// UnadvertiseCallback (optional) is called synchronously from
// UnadvertiseRoute.
UnadvertiseCallback func()
routes []netip.Prefix
removedRoutes []netip.Prefix
}
func (rc *RouteCollector) AdvertiseRoute(pfx ...netip.Prefix) error {
rc.routes = append(rc.routes, pfx...)
if rc.AdvertiseCallback != nil {
rc.AdvertiseCallback()
}
return nil
}
@@ -30,6 +40,9 @@ func (rc *RouteCollector) UnadvertiseRoute(toRemove ...netip.Prefix) error {
rc.removedRoutes = append(rc.removedRoutes, r)
}
}
if rc.UnadvertiseCallback != nil {
rc.UnadvertiseCallback()
}
return nil
}

View File

@@ -37,7 +37,7 @@ while [ "$#" -gt 1 ]; do
--extra-small)
shift
ldflags="$ldflags -w -s"
tags="${tags:+$tags,}ts_omit_aws,ts_omit_bird,ts_omit_tap,ts_omit_kube,ts_omit_completion,ts_omit_ssh,ts_omit_wakeonlan"
tags="${tags:+$tags,}ts_omit_aws,ts_omit_bird,ts_omit_tap,ts_omit_kube,ts_omit_completion,ts_omit_ssh,ts_omit_wakeonlan,ts_omit_capture"
;;
--box)
shift

View File

@@ -1,13 +1,11 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
import React, { useState } from "react"
import React from "react"
import { useAPI } from "src/api"
import TailscaleIcon from "src/assets/icons/tailscale-icon.svg?react"
import { NodeData } from "src/types"
import Button from "src/ui/button"
import Collapsible from "src/ui/collapsible"
import Input from "src/ui/input"
/**
* LoginView is rendered when the client is not authenticated
@@ -15,8 +13,6 @@ import Input from "src/ui/input"
*/
export default function LoginView({ data }: { data: NodeData }) {
const api = useAPI()
const [controlURL, setControlURL] = useState<string>("")
const [authKey, setAuthKey] = useState<string>("")
return (
<div className="mb-8 py-6 px-8 bg-white rounded-md shadow-2xl">
@@ -88,8 +84,6 @@ export default function LoginView({ data }: { data: NodeData }) {
action: "up",
data: {
Reauthenticate: true,
ControlURL: controlURL,
AuthKey: authKey,
},
})
}
@@ -98,34 +92,6 @@ export default function LoginView({ data }: { data: NodeData }) {
>
Log In
</Button>
<Collapsible trigger="Advanced options">
<h4 className="font-medium mb-1 mt-2">Auth Key</h4>
<p className="text-sm text-gray-500">
Connect with a pre-authenticated key.{" "}
<a
href="https://tailscale.com/kb/1085/auth-keys/"
className="link"
target="_blank"
rel="noreferrer"
>
Learn more &rarr;
</a>
</p>
<Input
className="mt-2"
value={authKey}
onChange={(e) => setAuthKey(e.target.value)}
placeholder="tskey-auth-XXX"
/>
<h4 className="font-medium mt-3 mb-1">Server URL</h4>
<p className="text-sm text-gray-500">Base URL of control server.</p>
<Input
className="mt-2"
value={controlURL}
onChange={(e) => setControlURL(e.target.value)}
placeholder="https://login.tailscale.com/"
/>
</Collapsible>
</>
)}
</div>

View File

@@ -203,25 +203,9 @@ func NewServer(opts ServerOpts) (s *Server, err error) {
}
s.assetsHandler, s.assetsCleanup = assetsHandler(s.devMode)
var metric string // clientmetric to report on startup
// Create handler for "/api" requests with CSRF protection.
// We don't require secure cookies, since the web client is regularly used
// on network appliances that are served on local non-https URLs.
// The client is secured by limiting the interface it listens on,
// or by authenticating requests before they reach the web client.
csrfProtect := csrf.Protect(s.csrfKey(), csrf.Secure(false))
switch s.mode {
case LoginServerMode:
s.apiHandler = csrfProtect(http.HandlerFunc(s.serveLoginAPI))
metric = "web_login_client_initialization"
case ReadOnlyServerMode:
s.apiHandler = csrfProtect(http.HandlerFunc(s.serveLoginAPI))
metric = "web_readonly_client_initialization"
case ManageServerMode:
s.apiHandler = csrfProtect(http.HandlerFunc(s.serveAPI))
metric = "web_client_initialization"
}
var metric string
s.apiHandler, metric = s.modeAPIHandler(s.mode)
s.apiHandler = s.withCSRF(s.apiHandler)
// Don't block startup on reporting metric.
// Report in separate go routine with 5 second timeout.
@@ -234,6 +218,39 @@ func NewServer(opts ServerOpts) (s *Server, err error) {
return s, nil
}
func (s *Server) withCSRF(h http.Handler) http.Handler {
csrfProtect := csrf.Protect(s.csrfKey(), csrf.Secure(false))
// ref https://github.com/tailscale/tailscale/pull/14822
// signal to the CSRF middleware that the request is being served over
// plaintext HTTP to skip TLS-only header checks.
withSetPlaintext := func(h http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
r = csrf.PlaintextHTTPRequest(r)
h.ServeHTTP(w, r)
})
}
// NB: the order of the withSetPlaintext and csrfProtect calls is important
// to ensure that we signal to the CSRF middleware that the request is being
// served over plaintext HTTP and not over TLS as it presumes by default.
return withSetPlaintext(csrfProtect(h))
}
func (s *Server) modeAPIHandler(mode ServerMode) (http.Handler, string) {
switch mode {
case LoginServerMode:
return http.HandlerFunc(s.serveLoginAPI), "web_login_client_initialization"
case ReadOnlyServerMode:
return http.HandlerFunc(s.serveLoginAPI), "web_readonly_client_initialization"
case ManageServerMode:
return http.HandlerFunc(s.serveAPI), "web_client_initialization"
default: // invalid mode
log.Fatalf("invalid mode: %v", mode)
}
return nil, ""
}
func (s *Server) Shutdown() {
s.logf("web.Server: shutting down")
if s.assetsCleanup != nil {

View File

@@ -11,6 +11,7 @@ import (
"fmt"
"io"
"net/http"
"net/http/cookiejar"
"net/http/httptest"
"net/netip"
"net/url"
@@ -20,6 +21,7 @@ import (
"time"
"github.com/google/go-cmp/cmp"
"github.com/gorilla/csrf"
"tailscale.com/client/tailscale"
"tailscale.com/client/tailscale/apitype"
"tailscale.com/ipn"
@@ -1477,3 +1479,83 @@ func mockWaitAuthURL(_ context.Context, id string, src tailcfg.NodeID) (*tailcfg
return nil, errors.New("unknown id")
}
}
func TestCSRFProtect(t *testing.T) {
s := &Server{}
mux := http.NewServeMux()
mux.HandleFunc("GET /test/csrf-token", func(w http.ResponseWriter, r *http.Request) {
token := csrf.Token(r)
_, err := io.WriteString(w, token)
if err != nil {
t.Fatal(err)
}
})
mux.HandleFunc("POST /test/csrf-protected", func(w http.ResponseWriter, r *http.Request) {
_, err := io.WriteString(w, "ok")
if err != nil {
t.Fatal(err)
}
})
h := s.withCSRF(mux)
ser := httptest.NewServer(h)
defer ser.Close()
jar, err := cookiejar.New(nil)
if err != nil {
t.Fatalf("unable to construct cookie jar: %v", err)
}
client := ser.Client()
client.Jar = jar
// make GET request to populate cookie jar
resp, err := client.Get(ser.URL + "/test/csrf-token")
if err != nil {
t.Fatalf("unable to make request: %v", err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
t.Fatalf("unexpected status: %v", resp.Status)
}
tokenBytes, err := io.ReadAll(resp.Body)
if err != nil {
t.Fatalf("unable to read body: %v", err)
}
csrfToken := strings.TrimSpace(string(tokenBytes))
if csrfToken == "" {
t.Fatal("empty csrf token")
}
// make a POST request without the CSRF header; ensure it fails
resp, err = client.Post(ser.URL+"/test/csrf-protected", "text/plain", nil)
if err != nil {
t.Fatalf("unable to make request: %v", err)
}
if resp.StatusCode != http.StatusForbidden {
t.Fatalf("unexpected status: %v", resp.Status)
}
// make a POST request with the CSRF header; ensure it succeeds
req, err := http.NewRequest("POST", ser.URL+"/test/csrf-protected", nil)
if err != nil {
t.Fatalf("error building request: %v", err)
}
req.Header.Set("X-CSRF-Token", csrfToken)
resp, err = client.Do(req)
if err != nil {
t.Fatalf("unable to make request: %v", err)
}
if resp.StatusCode != http.StatusOK {
t.Fatalf("unexpected status: %v", resp.Status)
}
defer resp.Body.Close()
out, err := io.ReadAll(resp.Body)
if err != nil {
t.Fatalf("unable to read body: %v", err)
}
if string(out) != "ok" {
t.Fatalf("unexpected body: %q", out)
}
}

View File

@@ -6,9 +6,12 @@
package main
import (
"fmt"
"log"
"net/http"
"sync"
"tailscale.com/kube/kubetypes"
)
// healthz is a simple health check server, if enabled it returns 200 OK if
@@ -17,6 +20,7 @@ import (
type healthz struct {
sync.Mutex
hasAddrs bool
podIPv4 string
}
func (h *healthz) ServeHTTP(w http.ResponseWriter, r *http.Request) {
@@ -24,7 +28,10 @@ func (h *healthz) ServeHTTP(w http.ResponseWriter, r *http.Request) {
defer h.Unlock()
if h.hasAddrs {
w.Write([]byte("ok"))
w.Header().Add(kubetypes.PodIPv4Header, h.podIPv4)
if _, err := w.Write([]byte("ok")); err != nil {
http.Error(w, fmt.Sprintf("error writing status: %v", err), http.StatusInternalServerError)
}
} else {
http.Error(w, "node currently has no tailscale IPs", http.StatusServiceUnavailable)
}
@@ -43,8 +50,8 @@ func (h *healthz) update(healthy bool) {
// healthHandlers registers a simple health handler at /healthz.
// A containerized tailscale instance is considered healthy if
// it has at least one tailnet IP address.
func healthHandlers(mux *http.ServeMux) *healthz {
h := &healthz{}
func healthHandlers(mux *http.ServeMux, podIPv4 string) *healthz {
h := &healthz{podIPv4: podIPv4}
mux.Handle("GET /healthz", h)
return h
}

View File

@@ -8,15 +8,22 @@ package main
import (
"context"
"encoding/json"
"errors"
"fmt"
"log"
"net/http"
"net/netip"
"os"
"strings"
"time"
"tailscale.com/ipn"
"tailscale.com/kube/kubeapi"
"tailscale.com/kube/kubeclient"
"tailscale.com/kube/kubetypes"
"tailscale.com/logtail/backoff"
"tailscale.com/tailcfg"
"tailscale.com/types/logger"
)
// kubeClient is a wrapper around Tailscale's internal kube client that knows how to talk to the kube API server. We use
@@ -126,3 +133,62 @@ func (kc *kubeClient) storeCapVerUID(ctx context.Context, podUID string) error {
}
return kc.StrategicMergePatchSecret(ctx, kc.stateSecret, s, "tailscale-container")
}
// waitForConsistentState waits for tailscaled to finish writing state if it
// looks like it's started. It is designed to reduce the likelihood that
// tailscaled gets shut down in the window between authenticating to control
// and finishing writing state. However, it's not bullet proof because we can't
// atomically authenticate and write state.
func (kc *kubeClient) waitForConsistentState(ctx context.Context) error {
var logged bool
bo := backoff.NewBackoff("", logger.Discard, 2*time.Second)
for {
select {
case <-ctx.Done():
return ctx.Err()
default:
}
secret, err := kc.GetSecret(ctx, kc.stateSecret)
if ctx.Err() != nil || kubeclient.IsNotFoundErr(err) {
return nil
}
if err != nil {
return fmt.Errorf("getting Secret %q: %v", kc.stateSecret, err)
}
if hasConsistentState(secret.Data) {
return nil
}
if !logged {
log.Printf("Waiting for tailscaled to finish writing state to Secret %q", kc.stateSecret)
logged = true
}
bo.BackOff(ctx, errors.New("")) // Fake error to trigger actual sleep.
}
}
// hasConsistentState returns true is there is either no state or the full set
// of expected keys are present.
func hasConsistentState(d map[string][]byte) bool {
var (
_, hasCurrent = d[string(ipn.CurrentProfileStateKey)]
_, hasKnown = d[string(ipn.KnownProfilesStateKey)]
_, hasMachine = d[string(ipn.MachineKeyStateKey)]
hasProfile bool
)
for k := range d {
if strings.HasPrefix(k, "profile-") {
if hasProfile {
return false // We only expect one profile.
}
hasProfile = true
}
}
// Approximate check, we don't want to reimplement all of profileManager.
return (hasCurrent && hasKnown && hasMachine && hasProfile) ||
(!hasCurrent && !hasKnown && !hasMachine && !hasProfile)
}

View File

@@ -9,8 +9,10 @@ import (
"context"
"errors"
"testing"
"time"
"github.com/google/go-cmp/cmp"
"tailscale.com/ipn"
"tailscale.com/kube/kubeapi"
"tailscale.com/kube/kubeclient"
)
@@ -205,3 +207,34 @@ func TestSetupKube(t *testing.T) {
})
}
}
func TestWaitForConsistentState(t *testing.T) {
data := map[string][]byte{
// Missing _current-profile.
string(ipn.KnownProfilesStateKey): []byte(""),
string(ipn.MachineKeyStateKey): []byte(""),
"profile-foo": []byte(""),
}
kc := &kubeClient{
Client: &kubeclient.FakeClient{
GetSecretImpl: func(context.Context, string) (*kubeapi.Secret, error) {
return &kubeapi.Secret{
Data: data,
}, nil
},
},
}
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
defer cancel()
if err := kc.waitForConsistentState(ctx); err != context.DeadlineExceeded {
t.Fatalf("expected DeadlineExceeded, got %v", err)
}
ctx, cancel = context.WithTimeout(context.Background(), time.Second)
defer cancel()
data[string(ipn.CurrentProfileStateKey)] = []byte("")
if err := kc.waitForConsistentState(ctx); err != nil {
t.Fatalf("expected nil, got %v", err)
}
}

View File

@@ -137,53 +137,83 @@ func newNetfilterRunner(logf logger.Logf) (linuxfw.NetfilterRunner, error) {
}
func main() {
if err := run(); err != nil && !errors.Is(err, context.Canceled) {
log.Fatal(err)
}
}
func run() error {
log.SetPrefix("boot: ")
tailscale.I_Acknowledge_This_API_Is_Unstable = true
cfg, err := configFromEnv()
if err != nil {
log.Fatalf("invalid configuration: %v", err)
return fmt.Errorf("invalid configuration: %w", err)
}
if !cfg.UserspaceMode {
if err := ensureTunFile(cfg.Root); err != nil {
log.Fatalf("Unable to create tuntap device file: %v", err)
return fmt.Errorf("unable to create tuntap device file: %w", err)
}
if cfg.ProxyTargetIP != "" || cfg.ProxyTargetDNSName != "" || cfg.Routes != nil || cfg.TailnetTargetIP != "" || cfg.TailnetTargetFQDN != "" {
if err := ensureIPForwarding(cfg.Root, cfg.ProxyTargetIP, cfg.TailnetTargetIP, cfg.TailnetTargetFQDN, cfg.Routes); err != nil {
log.Printf("Failed to enable IP forwarding: %v", err)
log.Printf("To run tailscale as a proxy or router container, IP forwarding must be enabled.")
if cfg.InKubernetes {
log.Fatalf("You can either set the sysctls as a privileged initContainer, or run the tailscale container with privileged=true.")
return fmt.Errorf("you can either set the sysctls as a privileged initContainer, or run the tailscale container with privileged=true.")
} else {
log.Fatalf("You can fix this by running the container with privileged=true, or the equivalent in your container runtime that permits access to sysctls.")
return fmt.Errorf("you can fix this by running the container with privileged=true, or the equivalent in your container runtime that permits access to sysctls.")
}
}
}
}
// Context is used for all setup stuff until we're in steady
// Root context for the whole containerboot process, used to make sure
// shutdown signals are promptly and cleanly handled.
ctx, cancel := contextWithExitSignalWatch()
defer cancel()
// bootCtx is used for all setup stuff until we're in steady
// state, so that if something is hanging we eventually time out
// and crashloop the container.
bootCtx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
bootCtx, cancel := context.WithTimeout(ctx, 60*time.Second)
defer cancel()
var kc *kubeClient
if cfg.InKubernetes {
kc, err = newKubeClient(cfg.Root, cfg.KubeSecret)
if err != nil {
log.Fatalf("error initializing kube client: %v", err)
return fmt.Errorf("error initializing kube client: %w", err)
}
if err := cfg.setupKube(bootCtx, kc); err != nil {
log.Fatalf("error setting up for running on Kubernetes: %v", err)
return fmt.Errorf("error setting up for running on Kubernetes: %w", err)
}
}
client, daemonProcess, err := startTailscaled(bootCtx, cfg)
if err != nil {
log.Fatalf("failed to bring up tailscale: %v", err)
return fmt.Errorf("failed to bring up tailscale: %w", err)
}
killTailscaled := func() {
if hasKubeStateStore(cfg) {
// Check we're not shutting tailscaled down while it's still writing
// state. If we authenticate and fail to write all the state, we'll
// never recover automatically.
//
// The default termination grace period for a Pod is 30s. We wait 25s at
// most so that we still reserve some of that budget for tailscaled
// to receive and react to a SIGTERM before the SIGKILL that k8s
// will send at the end of the grace period.
ctx, cancel := context.WithTimeout(context.Background(), 25*time.Second)
defer cancel()
log.Printf("Checking for consistent state")
err := kc.waitForConsistentState(ctx)
if err != nil {
log.Printf("Error waiting for consistent state on shutdown: %v", err)
}
}
log.Printf("Sending SIGTERM to tailscaled")
if err := daemonProcess.Signal(unix.SIGTERM); err != nil {
log.Fatalf("error shutting tailscaled down: %v", err)
}
@@ -191,17 +221,18 @@ func main() {
defer killTailscaled()
var healthCheck *healthz
ep := &egressProxy{}
if cfg.HealthCheckAddrPort != "" {
mux := http.NewServeMux()
log.Printf("Running healthcheck endpoint at %s/healthz", cfg.HealthCheckAddrPort)
healthCheck = healthHandlers(mux)
healthCheck = healthHandlers(mux, cfg.PodIPv4)
close := runHTTPServer(mux, cfg.HealthCheckAddrPort)
defer close()
}
if cfg.localMetricsEnabled() || cfg.localHealthEnabled() {
if cfg.localMetricsEnabled() || cfg.localHealthEnabled() || cfg.egressSvcsTerminateEPEnabled() {
mux := http.NewServeMux()
if cfg.localMetricsEnabled() {
@@ -211,7 +242,11 @@ func main() {
if cfg.localHealthEnabled() {
log.Printf("Running healthcheck endpoint at %s/healthz", cfg.LocalAddrPort)
healthCheck = healthHandlers(mux)
healthCheck = healthHandlers(mux, cfg.PodIPv4)
}
if cfg.EgressProxiesCfgPath != "" {
log.Printf("Running preshutdown hook at %s%s", cfg.LocalAddrPort, kubetypes.EgessServicesPreshutdownEP)
ep.registerHandlers(mux)
}
close := runHTTPServer(mux, cfg.LocalAddrPort)
@@ -226,7 +261,7 @@ func main() {
w, err := client.WatchIPNBus(bootCtx, ipn.NotifyInitialNetMap|ipn.NotifyInitialPrefs|ipn.NotifyInitialState)
if err != nil {
log.Fatalf("failed to watch tailscaled for updates: %v", err)
return fmt.Errorf("failed to watch tailscaled for updates: %w", err)
}
// Now that we've started tailscaled, we can symlink the socket to the
@@ -262,18 +297,18 @@ func main() {
didLogin = true
w.Close()
if err := tailscaleUp(bootCtx, cfg); err != nil {
return fmt.Errorf("failed to auth tailscale: %v", err)
return fmt.Errorf("failed to auth tailscale: %w", err)
}
w, err = client.WatchIPNBus(bootCtx, ipn.NotifyInitialNetMap|ipn.NotifyInitialState)
if err != nil {
return fmt.Errorf("rewatching tailscaled for updates after auth: %v", err)
return fmt.Errorf("rewatching tailscaled for updates after auth: %w", err)
}
return nil
}
if isTwoStepConfigAlwaysAuth(cfg) {
if err := authTailscale(); err != nil {
log.Fatalf("failed to auth tailscale: %v", err)
return fmt.Errorf("failed to auth tailscale: %w", err)
}
}
@@ -281,7 +316,7 @@ authLoop:
for {
n, err := w.Next()
if err != nil {
log.Fatalf("failed to read from tailscaled: %v", err)
return fmt.Errorf("failed to read from tailscaled: %w", err)
}
if n.State != nil {
@@ -290,10 +325,10 @@ authLoop:
if isOneStepConfig(cfg) {
// This could happen if this is the first time tailscaled was run for this
// device and the auth key was not passed via the configfile.
log.Fatalf("invalid state: tailscaled daemon started with a config file, but tailscale is not logged in: ensure you pass a valid auth key in the config file.")
return fmt.Errorf("invalid state: tailscaled daemon started with a config file, but tailscale is not logged in: ensure you pass a valid auth key in the config file.")
}
if err := authTailscale(); err != nil {
log.Fatalf("failed to auth tailscale: %v", err)
return fmt.Errorf("failed to auth tailscale: %w", err)
}
case ipn.NeedsMachineAuth:
log.Printf("machine authorization required, please visit the admin panel")
@@ -313,14 +348,11 @@ authLoop:
w.Close()
ctx, cancel := contextWithExitSignalWatch()
defer cancel()
if isTwoStepConfigAuthOnce(cfg) {
// Now that we are authenticated, we can set/reset any of the
// settings that we need to.
if err := tailscaleSet(ctx, cfg); err != nil {
log.Fatalf("failed to auth tailscale: %v", err)
return fmt.Errorf("failed to auth tailscale: %w", err)
}
}
@@ -329,11 +361,11 @@ authLoop:
if cfg.ServeConfigPath != "" {
log.Printf("serve proxy: unsetting previous config")
if err := client.SetServeConfig(ctx, new(ipn.ServeConfig)); err != nil {
log.Fatalf("failed to unset serve config: %v", err)
return fmt.Errorf("failed to unset serve config: %w", err)
}
if hasKubeStateStore(cfg) {
if err := kc.storeHTTPSEndpoint(ctx, ""); err != nil {
log.Fatalf("failed to update HTTPS endpoint in tailscale state: %v", err)
return fmt.Errorf("failed to update HTTPS endpoint in tailscale state: %w", err)
}
}
}
@@ -344,19 +376,19 @@ authLoop:
// wipe it, but it's good hygiene.
log.Printf("Deleting authkey from kube secret")
if err := kc.deleteAuthKey(ctx); err != nil {
log.Fatalf("deleting authkey from kube secret: %v", err)
return fmt.Errorf("deleting authkey from kube secret: %w", err)
}
}
if hasKubeStateStore(cfg) {
if err := kc.storeCapVerUID(ctx, cfg.PodUID); err != nil {
log.Fatalf("storing capability version and UID: %v", err)
return fmt.Errorf("storing capability version and UID: %w", err)
}
}
w, err = client.WatchIPNBus(ctx, ipn.NotifyInitialNetMap|ipn.NotifyInitialState)
if err != nil {
log.Fatalf("rewatching tailscaled for updates after auth: %v", err)
return fmt.Errorf("rewatching tailscaled for updates after auth: %w", err)
}
// If tailscaled config was read from a mounted file, watch the file for updates and reload.
@@ -386,7 +418,7 @@ authLoop:
if isL3Proxy(cfg) {
nfr, err = newNetfilterRunner(log.Printf)
if err != nil {
log.Fatalf("error creating new netfilter runner: %v", err)
return fmt.Errorf("error creating new netfilter runner: %w", err)
}
}
@@ -457,9 +489,9 @@ runLoop:
killTailscaled()
break runLoop
case err := <-errChan:
log.Fatalf("failed to read from tailscaled: %v", err)
return fmt.Errorf("failed to read from tailscaled: %w", err)
case err := <-cfgWatchErrChan:
log.Fatalf("failed to watch tailscaled config: %v", err)
return fmt.Errorf("failed to watch tailscaled config: %w", err)
case n := <-notifyChan:
if n.State != nil && *n.State != ipn.Running {
// Something's gone wrong and we've left the authenticated state.
@@ -467,7 +499,7 @@ runLoop:
// control flow required to make it work now is hard. So, just crash
// the container and rely on the container runtime to restart us,
// whereupon we'll go through initial auth again.
log.Fatalf("tailscaled left running state (now in state %q), exiting", *n.State)
return fmt.Errorf("tailscaled left running state (now in state %q), exiting", *n.State)
}
if n.NetMap != nil {
addrs = n.NetMap.SelfNode.Addresses().AsSlice()
@@ -485,7 +517,7 @@ runLoop:
deviceID := n.NetMap.SelfNode.StableID()
if hasKubeStateStore(cfg) && deephash.Update(&currentDeviceID, &deviceID) {
if err := kc.storeDeviceID(ctx, n.NetMap.SelfNode.StableID()); err != nil {
log.Fatalf("storing device ID in Kubernetes Secret: %v", err)
return fmt.Errorf("storing device ID in Kubernetes Secret: %w", err)
}
}
if cfg.TailnetTargetFQDN != "" {
@@ -522,12 +554,12 @@ runLoop:
rulesInstalled = true
log.Printf("Installing forwarding rules for destination %v", ea.String())
if err := installEgressForwardingRule(ctx, ea.String(), addrs, nfr); err != nil {
log.Fatalf("installing egress proxy rules for destination %s: %v", ea.String(), err)
return fmt.Errorf("installing egress proxy rules for destination %s: %v", ea.String(), err)
}
}
}
if !rulesInstalled {
log.Fatalf("no forwarding rules for egress addresses %v, host supports IPv6: %v", egressAddrs, nfr.HasIPV6NAT())
return fmt.Errorf("no forwarding rules for egress addresses %v, host supports IPv6: %v", egressAddrs, nfr.HasIPV6NAT())
}
}
currentEgressIPs = newCurentEgressIPs
@@ -535,7 +567,7 @@ runLoop:
if cfg.ProxyTargetIP != "" && len(addrs) != 0 && ipsHaveChanged {
log.Printf("Installing proxy rules")
if err := installIngressForwardingRule(ctx, cfg.ProxyTargetIP, addrs, nfr); err != nil {
log.Fatalf("installing ingress proxy rules: %v", err)
return fmt.Errorf("installing ingress proxy rules: %w", err)
}
}
if cfg.ProxyTargetDNSName != "" && len(addrs) != 0 && ipsHaveChanged {
@@ -551,7 +583,7 @@ runLoop:
if backendsHaveChanged {
log.Printf("installing ingress proxy rules for backends %v", newBackendAddrs)
if err := installIngressForwardingRuleForDNSTarget(ctx, newBackendAddrs, addrs, nfr); err != nil {
log.Fatalf("error installing ingress proxy rules: %v", err)
return fmt.Errorf("error installing ingress proxy rules: %w", err)
}
}
resetTimer(false)
@@ -573,7 +605,7 @@ runLoop:
if cfg.TailnetTargetIP != "" && ipsHaveChanged && len(addrs) != 0 {
log.Printf("Installing forwarding rules for destination %v", cfg.TailnetTargetIP)
if err := installEgressForwardingRule(ctx, cfg.TailnetTargetIP, addrs, nfr); err != nil {
log.Fatalf("installing egress proxy rules: %v", err)
return fmt.Errorf("installing egress proxy rules: %w", err)
}
}
// If this is a L7 cluster ingress proxy (set up
@@ -585,7 +617,7 @@ runLoop:
if cfg.AllowProxyingClusterTrafficViaIngress && cfg.ServeConfigPath != "" && ipsHaveChanged && len(addrs) != 0 {
log.Printf("installing rules to forward traffic for %s to node's tailnet IP", cfg.PodIP)
if err := installTSForwardingRuleForDestination(ctx, cfg.PodIP, addrs, nfr); err != nil {
log.Fatalf("installing rules to forward traffic to node's tailnet IP: %v", err)
return fmt.Errorf("installing rules to forward traffic to node's tailnet IP: %w", err)
}
}
currentIPs = newCurrentIPs
@@ -604,7 +636,7 @@ runLoop:
deviceEndpoints := []any{n.NetMap.SelfNode.Name(), n.NetMap.SelfNode.Addresses()}
if hasKubeStateStore(cfg) && deephash.Update(&currentDeviceEndpoints, &deviceEndpoints) {
if err := kc.storeDeviceEndpoints(ctx, n.NetMap.SelfNode.Name(), n.NetMap.SelfNode.Addresses().AsSlice()); err != nil {
log.Fatalf("storing device IPs and FQDN in Kubernetes Secret: %v", err)
return fmt.Errorf("storing device IPs and FQDN in Kubernetes Secret: %w", err)
}
}
@@ -639,20 +671,21 @@ runLoop:
// will then continuously monitor the config file and netmap updates and
// reconfigure the firewall rules as needed. If any of its operations fail, it
// will crash this node.
if cfg.EgressSvcsCfgPath != "" {
log.Printf("configuring egress proxy using configuration file at %s", cfg.EgressSvcsCfgPath)
if cfg.EgressProxiesCfgPath != "" {
log.Printf("configuring egress proxy using configuration file at %s", cfg.EgressProxiesCfgPath)
egressSvcsNotify = make(chan ipn.Notify)
ep := egressProxy{
cfgPath: cfg.EgressSvcsCfgPath,
opts := egressProxyRunOpts{
cfgPath: cfg.EgressProxiesCfgPath,
nfr: nfr,
kc: kc,
tsClient: client,
stateSecret: cfg.KubeSecret,
netmapChan: egressSvcsNotify,
podIPv4: cfg.PodIPv4,
tailnetAddrs: addrs,
}
go func() {
if err := ep.run(ctx, n); err != nil {
if err := ep.run(ctx, n, opts); err != nil {
egressSvcsErrorChan <- err
}
}()
@@ -694,16 +727,18 @@ runLoop:
if backendsHaveChanged && len(addrs) != 0 {
log.Printf("Backend address change detected, installing proxy rules for backends %v", newBackendAddrs)
if err := installIngressForwardingRuleForDNSTarget(ctx, newBackendAddrs, addrs, nfr); err != nil {
log.Fatalf("installing ingress proxy rules for DNS target %s: %v", cfg.ProxyTargetDNSName, err)
return fmt.Errorf("installing ingress proxy rules for DNS target %s: %v", cfg.ProxyTargetDNSName, err)
}
}
backendAddrs = newBackendAddrs
resetTimer(false)
case e := <-egressSvcsErrorChan:
log.Fatalf("egress proxy failed: %v", e)
return fmt.Errorf("egress proxy failed: %v", e)
}
}
wg.Wait()
return nil
}
// ensureTunFile checks that /dev/net/tun exists, creating it if
@@ -732,13 +767,13 @@ func resolveDNS(ctx context.Context, name string) ([]net.IP, error) {
ip4s, err := net.DefaultResolver.LookupIP(ctx, "ip4", name)
if err != nil {
if e, ok := err.(*net.DNSError); !(ok && e.IsNotFound) {
return nil, fmt.Errorf("error looking up IPv4 addresses: %v", err)
return nil, fmt.Errorf("error looking up IPv4 addresses: %w", err)
}
}
ip6s, err := net.DefaultResolver.LookupIP(ctx, "ip6", name)
if err != nil {
if e, ok := err.(*net.DNSError); !(ok && e.IsNotFound) {
return nil, fmt.Errorf("error looking up IPv6 addresses: %v", err)
return nil, fmt.Errorf("error looking up IPv6 addresses: %w", err)
}
}
if len(ip4s) == 0 && len(ip6s) == 0 {
@@ -751,7 +786,7 @@ func resolveDNS(ctx context.Context, name string) ([]net.IP, error) {
// context that gets cancelled when a signal is received and a cancel function
// that can be called to free the resources when the watch should be stopped.
func contextWithExitSignalWatch() (context.Context, func()) {
closeChan := make(chan string)
closeChan := make(chan struct{})
ctx, cancel := context.WithCancel(context.Background())
signalChan := make(chan os.Signal, 1)
signal.Notify(signalChan, syscall.SIGINT, syscall.SIGTERM)
@@ -763,8 +798,11 @@ func contextWithExitSignalWatch() (context.Context, func()) {
return
}
}()
closeOnce := sync.Once{}
f := func() {
closeChan <- "goodbye"
closeOnce.Do(func() {
close(closeChan)
})
}
return ctx, f
}
@@ -817,7 +855,11 @@ func runHTTPServer(mux *http.ServeMux, addr string) (close func() error) {
go func() {
if err := srv.Serve(ln); err != nil {
log.Fatalf("failed running server: %v", err)
if err != http.ErrServerClosed {
log.Fatalf("failed running server: %v", err)
} else {
log.Printf("HTTP server at %s closed", addr)
}
}
}()

View File

@@ -25,6 +25,7 @@ import (
"strconv"
"strings"
"sync"
"syscall"
"testing"
"time"
@@ -32,6 +33,8 @@ import (
"golang.org/x/sys/unix"
"tailscale.com/ipn"
"tailscale.com/kube/egressservices"
"tailscale.com/kube/kubeclient"
"tailscale.com/kube/kubetypes"
"tailscale.com/tailcfg"
"tailscale.com/tstest"
"tailscale.com/types/netmap"
@@ -48,26 +51,13 @@ func TestContainerBoot(t *testing.T) {
defer lapi.Close()
kube := kubeServer{FSRoot: d}
if err := kube.Start(); err != nil {
t.Fatal(err)
}
kube.Start(t)
defer kube.Close()
tailscaledConf := &ipn.ConfigVAlpha{AuthKey: ptr.To("foo"), Version: "alpha0"}
tailscaledConfBytes, err := json.Marshal(tailscaledConf)
if err != nil {
t.Fatalf("error unmarshaling tailscaled config: %v", err)
}
serveConf := ipn.ServeConfig{TCP: map[uint16]*ipn.TCPPortHandler{80: {HTTP: true}}}
serveConfBytes, err := json.Marshal(serveConf)
if err != nil {
t.Fatalf("error unmarshaling serve config: %v", err)
}
egressSvcsCfg := egressservices.Configs{"foo": {TailnetTarget: egressservices.TailnetTarget{FQDN: "foo.tailnetxyx.ts.net"}}}
egressSvcsCfgBytes, err := json.Marshal(egressSvcsCfg)
if err != nil {
t.Fatalf("error unmarshaling egress services config: %v", err)
}
egressCfg := egressSvcConfig("foo", "foo.tailnetxyz.ts.net")
egressStatus := egressSvcStatus("foo", "foo.tailnetxyz.ts.net")
dirs := []string{
"var/lib",
@@ -84,16 +74,17 @@ func TestContainerBoot(t *testing.T) {
}
}
files := map[string][]byte{
"usr/bin/tailscaled": fakeTailscaled,
"usr/bin/tailscale": fakeTailscale,
"usr/bin/iptables": fakeTailscale,
"usr/bin/ip6tables": fakeTailscale,
"dev/net/tun": []byte(""),
"proc/sys/net/ipv4/ip_forward": []byte("0"),
"proc/sys/net/ipv6/conf/all/forwarding": []byte("0"),
"etc/tailscaled/cap-95.hujson": tailscaledConfBytes,
"etc/tailscaled/serve-config.json": serveConfBytes,
"etc/tailscaled/egress-services-config.json": egressSvcsCfgBytes,
"usr/bin/tailscaled": fakeTailscaled,
"usr/bin/tailscale": fakeTailscale,
"usr/bin/iptables": fakeTailscale,
"usr/bin/ip6tables": fakeTailscale,
"dev/net/tun": []byte(""),
"proc/sys/net/ipv4/ip_forward": []byte("0"),
"proc/sys/net/ipv6/conf/all/forwarding": []byte("0"),
"etc/tailscaled/cap-95.hujson": mustJSON(t, tailscaledConf),
"etc/tailscaled/serve-config.json": mustJSON(t, serveConf),
filepath.Join("etc/tailscaled/", egressservices.KeyEgressServices): mustJSON(t, egressCfg),
filepath.Join("etc/tailscaled/", egressservices.KeyHEPPings): []byte("4"),
}
resetFiles := func() {
for path, content := range files {
@@ -132,6 +123,9 @@ func TestContainerBoot(t *testing.T) {
healthURL := func(port int) string {
return fmt.Sprintf("http://127.0.0.1:%d/healthz", port)
}
egressSvcTerminateURL := func(port int) string {
return fmt.Sprintf("http://127.0.0.1:%d%s", port, kubetypes.EgessServicesPreshutdownEP)
}
capver := fmt.Sprintf("%d", tailcfg.CurrentCapabilityVersion)
@@ -143,15 +137,29 @@ func TestContainerBoot(t *testing.T) {
// WantCmds is the commands that containerboot should run in this phase.
WantCmds []string
// WantKubeSecret is the secret keys/values that should exist in the
// kube secret.
WantKubeSecret map[string]string
// Update the kube secret with these keys/values at the beginning of the
// phase (simulates our fake tailscaled doing it).
UpdateKubeSecret map[string]string
// WantFiles files that should exist in the container and their
// contents.
WantFiles map[string]string
// WantFatalLog is the fatal log message we expect from containerboot.
// If set for a phase, the test will finish on that phase.
WantFatalLog string
// WantLog is a log message we expect from containerboot.
WantLog string
// If set for a phase, the test will expect containerboot to exit with
// this error code, and the test will finish on that phase without
// waiting for the successful startup log message.
WantExitCode *int
// The signal to send to containerboot at the start of the phase.
Signal *syscall.Signal
EndpointStatuses map[string]int
}
@@ -439,7 +447,8 @@ func TestContainerBoot(t *testing.T) {
},
},
},
WantFatalLog: "no forwarding rules for egress addresses [::1/128], host supports IPv6: false",
WantLog: "no forwarding rules for egress addresses [::1/128], host supports IPv6: false",
WantExitCode: ptr.To(1),
},
},
},
@@ -896,9 +905,10 @@ func TestContainerBoot(t *testing.T) {
{
Name: "egress_svcs_config_kube",
Env: map[string]string{
"KUBERNETES_SERVICE_HOST": kube.Host,
"KUBERNETES_SERVICE_PORT_HTTPS": kube.Port,
"TS_EGRESS_SERVICES_CONFIG_PATH": filepath.Join(d, "etc/tailscaled/egress-services-config.json"),
"KUBERNETES_SERVICE_HOST": kube.Host,
"KUBERNETES_SERVICE_PORT_HTTPS": kube.Port,
"TS_EGRESS_PROXIES_CONFIG_PATH": filepath.Join(d, "etc/tailscaled"),
"TS_LOCAL_ADDR_PORT": fmt.Sprintf("[::]:%d", localAddrPort),
},
KubeSecret: map[string]string{
"authkey": "tskey-key",
@@ -912,28 +922,92 @@ func TestContainerBoot(t *testing.T) {
WantKubeSecret: map[string]string{
"authkey": "tskey-key",
},
EndpointStatuses: map[string]int{
egressSvcTerminateURL(localAddrPort): 200,
},
},
{
Notify: runningNotify,
WantKubeSecret: map[string]string{
"egress-services": mustBase64(t, egressStatus),
"authkey": "tskey-key",
"device_fqdn": "test-node.test.ts.net",
"device_id": "myID",
"device_ips": `["100.64.0.1"]`,
"tailscale_capver": capver,
},
EndpointStatuses: map[string]int{
egressSvcTerminateURL(localAddrPort): 200,
},
},
},
},
{
Name: "egress_svcs_config_no_kube",
Env: map[string]string{
"TS_EGRESS_SERVICES_CONFIG_PATH": filepath.Join(d, "etc/tailscaled/egress-services-config.json"),
"TS_AUTHKEY": "tskey-key",
"TS_EGRESS_PROXIES_CONFIG_PATH": filepath.Join(d, "etc/tailscaled"),
"TS_AUTHKEY": "tskey-key",
},
Phases: []phase{
{
WantFatalLog: "TS_EGRESS_SERVICES_CONFIG_PATH is only supported for Tailscale running on Kubernetes",
WantLog: "TS_EGRESS_PROXIES_CONFIG_PATH is only supported for Tailscale running on Kubernetes",
WantExitCode: ptr.To(1),
},
},
},
{
Name: "kube_shutdown_during_state_write",
Env: map[string]string{
"KUBERNETES_SERVICE_HOST": kube.Host,
"KUBERNETES_SERVICE_PORT_HTTPS": kube.Port,
"TS_ENABLE_HEALTH_CHECK": "true",
},
KubeSecret: map[string]string{
"authkey": "tskey-key",
},
Phases: []phase{
{
// Normal startup.
WantCmds: []string{
"/usr/bin/tailscaled --socket=/tmp/tailscaled.sock --state=kube:tailscale --statedir=/tmp --tun=userspace-networking",
"/usr/bin/tailscale --socket=/tmp/tailscaled.sock up --accept-dns=false --authkey=tskey-key",
},
WantKubeSecret: map[string]string{
"authkey": "tskey-key",
},
},
{
// SIGTERM before state is finished writing, should wait for
// consistent state before propagating SIGTERM to tailscaled.
Signal: ptr.To(unix.SIGTERM),
UpdateKubeSecret: map[string]string{
"_machinekey": "foo",
"_profiles": "foo",
"profile-baff": "foo",
// Missing "_current-profile" key.
},
WantKubeSecret: map[string]string{
"authkey": "tskey-key",
"_machinekey": "foo",
"_profiles": "foo",
"profile-baff": "foo",
},
WantLog: "Waiting for tailscaled to finish writing state to Secret \"tailscale\"",
},
{
// tailscaled has finished writing state, should propagate SIGTERM.
UpdateKubeSecret: map[string]string{
"_current-profile": "foo",
},
WantKubeSecret: map[string]string{
"authkey": "tskey-key",
"_machinekey": "foo",
"_profiles": "foo",
"profile-baff": "foo",
"_current-profile": "foo",
},
WantLog: "HTTP server at [::]:9002 closed",
WantExitCode: ptr.To(0),
},
},
},
@@ -981,26 +1055,36 @@ func TestContainerBoot(t *testing.T) {
var wantCmds []string
for i, p := range test.Phases {
for k, v := range p.UpdateKubeSecret {
kube.SetSecret(k, v)
}
lapi.Notify(p.Notify)
if p.WantFatalLog != "" {
if p.Signal != nil {
cmd.Process.Signal(*p.Signal)
}
if p.WantLog != "" {
err := tstest.WaitFor(2*time.Second, func() error {
state, err := cmd.Process.Wait()
if err != nil {
return err
}
if state.ExitCode() != 1 {
return fmt.Errorf("process exited with code %d but wanted %d", state.ExitCode(), 1)
}
waitLogLine(t, time.Second, cbOut, p.WantFatalLog)
waitLogLine(t, time.Second, cbOut, p.WantLog)
return nil
})
if err != nil {
t.Fatal(err)
}
}
if p.WantExitCode != nil {
state, err := cmd.Process.Wait()
if err != nil {
t.Fatal(err)
}
if state.ExitCode() != *p.WantExitCode {
t.Fatalf("phase %d: want exit code %d, got %d", i, *p.WantExitCode, state.ExitCode())
}
// Early test return, we don't expect the successful startup log message.
return
}
wantCmds = append(wantCmds, p.WantCmds...)
waitArgs(t, 2*time.Second, d, argFile, strings.Join(wantCmds, "\n"))
err := tstest.WaitFor(2*time.Second, func() error {
@@ -1056,6 +1140,9 @@ func TestContainerBoot(t *testing.T) {
}
}
waitLogLine(t, 2*time.Second, cbOut, "Startup complete, waiting for shutdown signal")
if cmd.ProcessState != nil {
t.Fatalf("containerboot should be running but exited with exit code %d", cmd.ProcessState.ExitCode())
}
})
}
}
@@ -1287,18 +1374,18 @@ func (k *kubeServer) Reset() {
k.secret = map[string]string{}
}
func (k *kubeServer) Start() error {
func (k *kubeServer) Start(t *testing.T) {
root := filepath.Join(k.FSRoot, "var/run/secrets/kubernetes.io/serviceaccount")
if err := os.MkdirAll(root, 0700); err != nil {
return err
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(root, "namespace"), []byte("default"), 0600); err != nil {
return err
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(root, "token"), []byte("bearer_token"), 0600); err != nil {
return err
t.Fatal(err)
}
k.srv = httptest.NewTLSServer(k)
@@ -1307,13 +1394,11 @@ func (k *kubeServer) Start() error {
var cert bytes.Buffer
if err := pem.Encode(&cert, &pem.Block{Type: "CERTIFICATE", Bytes: k.srv.Certificate().Raw}); err != nil {
return err
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(root, "ca.crt"), cert.Bytes(), 0600); err != nil {
return err
t.Fatal(err)
}
return nil
}
func (k *kubeServer) Close() {
@@ -1362,6 +1447,7 @@ func (k *kubeServer) serveSecret(w http.ResponseWriter, r *http.Request) {
http.Error(w, fmt.Sprintf("reading request body: %v", err), http.StatusInternalServerError)
return
}
defer r.Body.Close()
switch r.Method {
case "GET":
@@ -1394,13 +1480,32 @@ func (k *kubeServer) serveSecret(w http.ResponseWriter, r *http.Request) {
panic(fmt.Sprintf("json decode failed: %v. Body:\n\n%s", err, string(bs)))
}
for _, op := range req {
if op.Op != "remove" {
switch op.Op {
case "remove":
if !strings.HasPrefix(op.Path, "/data/") {
panic(fmt.Sprintf("unsupported json-patch path %q", op.Path))
}
delete(k.secret, strings.TrimPrefix(op.Path, "/data/"))
case "replace":
path, ok := strings.CutPrefix(op.Path, "/data/")
if !ok {
panic(fmt.Sprintf("unsupported json-patch path %q", op.Path))
}
req := make([]kubeclient.JSONPatch, 0)
if err := json.Unmarshal(bs, &req); err != nil {
panic(fmt.Sprintf("json decode failed: %v. Body:\n\n%s", err, string(bs)))
}
for _, patch := range req {
val, ok := patch.Value.(string)
if !ok {
panic(fmt.Sprintf("unsupported json patch value %v: cannot be converted to string", patch.Value))
}
k.secret[path] = val
}
default:
panic(fmt.Sprintf("unsupported json-patch op %q", op.Op))
}
if !strings.HasPrefix(op.Path, "/data/") {
panic(fmt.Sprintf("unsupported json-patch path %q", op.Path))
}
delete(k.secret, strings.TrimPrefix(op.Path, "/data/"))
}
case "application/strategic-merge-patch+json":
req := struct {
@@ -1416,6 +1521,44 @@ func (k *kubeServer) serveSecret(w http.ResponseWriter, r *http.Request) {
panic(fmt.Sprintf("unknown content type %q", r.Header.Get("Content-Type")))
}
default:
panic(fmt.Sprintf("unhandled HTTP method %q", r.Method))
panic(fmt.Sprintf("unhandled HTTP request %s %s", r.Method, r.URL))
}
}
func mustBase64(t *testing.T, v any) string {
b := mustJSON(t, v)
s := base64.StdEncoding.WithPadding('=').EncodeToString(b)
return s
}
func mustJSON(t *testing.T, v any) []byte {
b, err := json.Marshal(v)
if err != nil {
t.Fatalf("error converting %v to json: %v", v, err)
}
return b
}
// egress services status given one named tailnet target specified by FQDN. As written by the proxy to its state Secret.
func egressSvcStatus(name, fqdn string) egressservices.Status {
return egressservices.Status{
Services: map[string]*egressservices.ServiceStatus{
name: {
TailnetTarget: egressservices.TailnetTarget{
FQDN: fqdn,
},
},
},
}
}
// egress config given one named tailnet target specified by FQDN.
func egressSvcConfig(name, fqdn string) egressservices.Configs {
return egressservices.Configs{
name: egressservices.Config{
TailnetTarget: egressservices.TailnetTarget{
FQDN: fqdn,
},
},
}
}

View File

@@ -11,18 +11,24 @@ import (
"errors"
"fmt"
"log"
"net/http"
"net/netip"
"os"
"path/filepath"
"reflect"
"strconv"
"strings"
"time"
"github.com/fsnotify/fsnotify"
"tailscale.com/client/tailscale"
"tailscale.com/ipn"
"tailscale.com/kube/egressservices"
"tailscale.com/kube/kubeclient"
"tailscale.com/kube/kubetypes"
"tailscale.com/syncs"
"tailscale.com/tailcfg"
"tailscale.com/util/httpm"
"tailscale.com/util/linuxfw"
"tailscale.com/util/mak"
)
@@ -37,13 +43,15 @@ const tailscaleTunInterface = "tailscale0"
// egressProxy knows how to configure firewall rules to route cluster traffic to
// one or more tailnet services.
type egressProxy struct {
cfgPath string // path to egress service config file
cfgPath string // path to a directory with egress services config files
nfr linuxfw.NetfilterRunner // never nil
kc kubeclient.Client // never nil
stateSecret string // name of the kube state Secret
tsClient *tailscale.LocalClient // never nil
netmapChan chan ipn.Notify // chan to receive netmap updates on
podIPv4 string // never empty string, currently only IPv4 is supported
@@ -55,15 +63,29 @@ type egressProxy struct {
// memory at all.
targetFQDNs map[string][]netip.Prefix
// used to configure firewall rules.
tailnetAddrs []netip.Prefix
tailnetAddrs []netip.Prefix // tailnet IPs of this tailnet device
// shortSleep is the backoff sleep between healthcheck endpoint calls - can be overridden in tests.
shortSleep time.Duration
// longSleep is the time to sleep after the routing rules are updated to increase the chance that kube
// proxies on all nodes have updated their routing configuration. It can be configured to 0 in
// tests.
longSleep time.Duration
// client is a client that can send HTTP requests.
client httpClient
}
// httpClient is a client that can send HTTP requests and can be mocked in tests.
type httpClient interface {
Do(*http.Request) (*http.Response, error)
}
// run configures egress proxy firewall rules and ensures that the firewall rules are reconfigured when:
// - the mounted egress config has changed
// - the proxy's tailnet IP addresses have changed
// - tailnet IPs have changed for any backend targets specified by tailnet FQDN
func (ep *egressProxy) run(ctx context.Context, n ipn.Notify) error {
func (ep *egressProxy) run(ctx context.Context, n ipn.Notify, opts egressProxyRunOpts) error {
ep.configure(opts)
var tickChan <-chan time.Time
var eventChan <-chan fsnotify.Event
// TODO (irbekrm): take a look if this can be pulled into a single func
@@ -75,7 +97,7 @@ func (ep *egressProxy) run(ctx context.Context, n ipn.Notify) error {
tickChan = ticker.C
} else {
defer w.Close()
if err := w.Add(filepath.Dir(ep.cfgPath)); err != nil {
if err := w.Add(ep.cfgPath); err != nil {
return fmt.Errorf("failed to add fsnotify watch: %w", err)
}
eventChan = w.Events
@@ -85,28 +107,52 @@ func (ep *egressProxy) run(ctx context.Context, n ipn.Notify) error {
return err
}
for {
var err error
select {
case <-ctx.Done():
return nil
case <-tickChan:
err = ep.sync(ctx, n)
log.Printf("periodic sync, ensuring firewall config is up to date...")
case <-eventChan:
log.Printf("config file change detected, ensuring firewall config is up to date...")
err = ep.sync(ctx, n)
case n = <-ep.netmapChan:
shouldResync := ep.shouldResync(n)
if shouldResync {
log.Printf("netmap change detected, ensuring firewall config is up to date...")
err = ep.sync(ctx, n)
if !shouldResync {
continue
}
log.Printf("netmap change detected, ensuring firewall config is up to date...")
}
if err != nil {
if err := ep.sync(ctx, n); err != nil {
return fmt.Errorf("error syncing egress service config: %w", err)
}
}
}
type egressProxyRunOpts struct {
cfgPath string
nfr linuxfw.NetfilterRunner
kc kubeclient.Client
tsClient *tailscale.LocalClient
stateSecret string
netmapChan chan ipn.Notify
podIPv4 string
tailnetAddrs []netip.Prefix
}
// applyOpts configures egress proxy using the provided options.
func (ep *egressProxy) configure(opts egressProxyRunOpts) {
ep.cfgPath = opts.cfgPath
ep.nfr = opts.nfr
ep.kc = opts.kc
ep.tsClient = opts.tsClient
ep.stateSecret = opts.stateSecret
ep.netmapChan = opts.netmapChan
ep.podIPv4 = opts.podIPv4
ep.tailnetAddrs = opts.tailnetAddrs
ep.client = &http.Client{} // default HTTP client
ep.shortSleep = time.Second
ep.longSleep = time.Second * 10
}
// sync triggers an egress proxy config resync. The resync calculates the diff between config and status to determine if
// any firewall rules need to be updated. Currently using status in state Secret as a reference for what is the current
// firewall configuration is good enough because - the status is keyed by the Pod IP - we crash the Pod on errors such
@@ -327,7 +373,8 @@ func (ep *egressProxy) deleteUnnecessaryServices(cfgs *egressservices.Configs, s
// getConfigs gets the mounted egress service configuration.
func (ep *egressProxy) getConfigs() (*egressservices.Configs, error) {
j, err := os.ReadFile(ep.cfgPath)
svcsCfg := filepath.Join(ep.cfgPath, egressservices.KeyEgressServices)
j, err := os.ReadFile(svcsCfg)
if os.IsNotExist(err) {
return nil, nil
}
@@ -569,3 +616,142 @@ func servicesStatusIsEqual(st, st1 *egressservices.Status) bool {
st1.PodIPv4 = ""
return reflect.DeepEqual(*st, *st1)
}
// registerHandlers adds a new handler to the provided ServeMux that can be called as a Kubernetes prestop hook to
// delay shutdown till it's safe to do so.
func (ep *egressProxy) registerHandlers(mux *http.ServeMux) {
mux.Handle(fmt.Sprintf("GET %s", kubetypes.EgessServicesPreshutdownEP), ep)
}
// ServeHTTP serves /internal-egress-services-preshutdown endpoint, when it receives a request, it periodically polls
// the configured health check endpoint for each egress service till it the health check endpoint no longer hits this
// proxy Pod. It uses the Pod-IPv4 header to verify if health check response is received from this Pod.
func (ep *egressProxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
cfgs, err := ep.getConfigs()
if err != nil {
http.Error(w, fmt.Sprintf("error retrieving egress services configs: %v", err), http.StatusInternalServerError)
return
}
if cfgs == nil {
if _, err := w.Write([]byte("safe to terminate")); err != nil {
http.Error(w, fmt.Sprintf("error writing termination status: %v", err), http.StatusInternalServerError)
return
}
}
hp, err := ep.getHEPPings()
if err != nil {
http.Error(w, fmt.Sprintf("error determining the number of times health check endpoint should be pinged: %v", err), http.StatusInternalServerError)
return
}
ep.waitTillSafeToShutdown(r.Context(), cfgs, hp)
}
// waitTillSafeToShutdown looks up all egress targets configured to be proxied via this instance and, for each target
// whose configuration includes a healthcheck endpoint, pings the endpoint till none of the responses
// are returned by this instance or till the HTTP request times out. In practice, the endpoint will be a Kubernetes Service for whom one of the backends
// would normally be this Pod. When this Pod is being deleted, the operator should have removed it from the Service
// backends and eventually kube proxy routing rules should be updated to no longer route traffic for the Service to this
// Pod.
func (ep *egressProxy) waitTillSafeToShutdown(ctx context.Context, cfgs *egressservices.Configs, hp int) {
if cfgs == nil || len(*cfgs) == 0 { // avoid sleeping if no services are configured
return
}
log.Printf("Ensuring that cluster traffic for egress targets is no longer routed via this Pod...")
wg := syncs.WaitGroup{}
for s, cfg := range *cfgs {
hep := cfg.HealthCheckEndpoint
if hep == "" {
log.Printf("Tailnet target %q does not have a cluster healthcheck specified, unable to verify if cluster traffic for the target is still routed via this Pod", s)
continue
}
svc := s
wg.Go(func() {
log.Printf("Ensuring that cluster traffic is no longer routed to %q via this Pod...", svc)
for {
if ctx.Err() != nil { // kubelet's HTTP request timeout
log.Printf("Cluster traffic for %s did not stop being routed to this Pod.", svc)
return
}
found, err := lookupPodRoute(ctx, hep, ep.podIPv4, hp, ep.client)
if err != nil {
log.Printf("unable to reach endpoint %q, assuming the routing rules for this Pod have been deleted: %v", hep, err)
break
}
if !found {
log.Printf("service %q is no longer routed through this Pod", svc)
break
}
log.Printf("service %q is still routed through this Pod, waiting...", svc)
time.Sleep(ep.shortSleep)
}
})
}
wg.Wait()
// The check above really only checked that the routing rules are updated on this node. Sleep for a bit to
// ensure that the routing rules are updated on other nodes. TODO(irbekrm): this may or may not be good enough.
// If it's not good enough, we'd probably want to do something more complex, where the proxies check each other.
log.Printf("Sleeping for %s before shutdown to ensure that kube proxies on all nodes have updated routing configuration", ep.longSleep)
time.Sleep(ep.longSleep)
}
// lookupPodRoute calls the healthcheck endpoint repeat times and returns true if the endpoint returns with the podIP
// header at least once.
func lookupPodRoute(ctx context.Context, hep, podIP string, repeat int, client httpClient) (bool, error) {
for range repeat {
f, err := lookup(ctx, hep, podIP, client)
if err != nil {
return false, err
}
if f {
return true, nil
}
}
return false, nil
}
// lookup calls the healthcheck endpoint and returns true if the response contains the podIP header.
func lookup(ctx context.Context, hep, podIP string, client httpClient) (bool, error) {
req, err := http.NewRequestWithContext(ctx, httpm.GET, hep, nil)
if err != nil {
return false, fmt.Errorf("error creating new HTTP request: %v", err)
}
// Close the TCP connection to ensure that the next request is routed to a different backend.
req.Close = true
resp, err := client.Do(req)
if err != nil {
log.Printf("Endpoint %q can not be reached: %v, likely because there are no (more) healthy backends", hep, err)
return true, nil
}
defer resp.Body.Close()
gotIP := resp.Header.Get(kubetypes.PodIPv4Header)
return strings.EqualFold(podIP, gotIP), nil
}
// getHEPPings gets the number of pings that should be sent to a health check endpoint to ensure that each configured
// backend is hit. This assumes that a health check endpoint is a Kubernetes Service and traffic to backend Pods is
// round robin load balanced.
func (ep *egressProxy) getHEPPings() (int, error) {
hepPingsPath := filepath.Join(ep.cfgPath, egressservices.KeyHEPPings)
j, err := os.ReadFile(hepPingsPath)
if os.IsNotExist(err) {
return 0, nil
}
if err != nil {
return -1, err
}
if len(j) == 0 || string(j) == "" {
return 0, nil
}
hp, err := strconv.Atoi(string(j))
if err != nil {
return -1, fmt.Errorf("error parsing hep pings as int: %v", err)
}
if hp < 0 {
log.Printf("[unexpected] hep pings is negative: %d", hp)
return 0, nil
}
return hp, nil
}

View File

@@ -6,11 +6,18 @@
package main
import (
"context"
"fmt"
"io"
"net/http"
"net/netip"
"reflect"
"strings"
"sync"
"testing"
"tailscale.com/kube/egressservices"
"tailscale.com/kube/kubetypes"
)
func Test_updatesForSvc(t *testing.T) {
@@ -173,3 +180,145 @@ func Test_updatesForSvc(t *testing.T) {
})
}
}
// A failure of this test will most likely look like a timeout.
func TestWaitTillSafeToShutdown(t *testing.T) {
podIP := "10.0.0.1"
anotherIP := "10.0.0.2"
tests := []struct {
name string
// services is a map of service name to the number of calls to make to the healthcheck endpoint before
// returning a response that does NOT contain this Pod's IP in headers.
services map[string]int
replicas int
healthCheckSet bool
}{
{
name: "no_configs",
},
{
name: "one_service_immediately_safe_to_shutdown",
services: map[string]int{
"svc1": 0,
},
replicas: 2,
healthCheckSet: true,
},
{
name: "multiple_services_immediately_safe_to_shutdown",
services: map[string]int{
"svc1": 0,
"svc2": 0,
"svc3": 0,
},
replicas: 2,
healthCheckSet: true,
},
{
name: "multiple_services_no_healthcheck_endpoints",
services: map[string]int{
"svc1": 0,
"svc2": 0,
"svc3": 0,
},
replicas: 2,
},
{
name: "one_service_eventually_safe_to_shutdown",
services: map[string]int{
"svc1": 3, // After 3 calls to health check endpoint, no longer returns this Pod's IP
},
replicas: 2,
healthCheckSet: true,
},
{
name: "multiple_services_eventually_safe_to_shutdown",
services: map[string]int{
"svc1": 1, // After 1 call to health check endpoint, no longer returns this Pod's IP
"svc2": 3, // After 3 calls to health check endpoint, no longer returns this Pod's IP
"svc3": 5, // After 5 calls to the health check endpoint, no longer returns this Pod's IP
},
replicas: 2,
healthCheckSet: true,
},
{
name: "multiple_services_eventually_safe_to_shutdown_with_higher_replica_count",
services: map[string]int{
"svc1": 7,
"svc2": 10,
},
replicas: 5,
healthCheckSet: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
cfgs := &egressservices.Configs{}
switches := make(map[string]int)
for svc, callsToSwitch := range tt.services {
endpoint := fmt.Sprintf("http://%s.local", svc)
if tt.healthCheckSet {
(*cfgs)[svc] = egressservices.Config{
HealthCheckEndpoint: endpoint,
}
}
switches[endpoint] = callsToSwitch
}
ep := &egressProxy{
podIPv4: podIP,
client: &mockHTTPClient{
podIP: podIP,
anotherIP: anotherIP,
switches: switches,
},
}
ep.waitTillSafeToShutdown(context.Background(), cfgs, tt.replicas)
})
}
}
// mockHTTPClient is a client that receives an HTTP call for an egress service endpoint and returns a response with an
// IP address in a 'Pod-IPv4' header. It can be configured to return one IP address for N calls, then switch to another
// IP address to simulate a scenario where an IP is eventually no longer a backend for an endpoint.
// TODO(irbekrm): to test this more thoroughly, we should have the client take into account the number of replicas and
// return as if traffic was round robin load balanced across different Pods.
type mockHTTPClient struct {
// podIP - initial IP address to return, that matches the current proxy's IP address.
podIP string
anotherIP string
// after how many calls to an endpoint, the client should start returning 'anotherIP' instead of 'podIP.
switches map[string]int
mu sync.Mutex // protects the following
// calls tracks the number of calls received.
calls map[string]int
}
func (m *mockHTTPClient) Do(req *http.Request) (*http.Response, error) {
m.mu.Lock()
if m.calls == nil {
m.calls = make(map[string]int)
}
endpoint := req.URL.String()
m.calls[endpoint]++
calls := m.calls[endpoint]
m.mu.Unlock()
resp := &http.Response{
StatusCode: http.StatusOK,
Header: make(http.Header),
Body: io.NopCloser(strings.NewReader("")),
}
if calls <= m.switches[endpoint] {
resp.Header.Set(kubetypes.PodIPv4Header, m.podIP) // Pod is still routable
} else {
resp.Header.Set(kubetypes.PodIPv4Header, m.anotherIP) // Pod is no longer routable
}
return resp, nil
}

View File

@@ -64,16 +64,16 @@ type settings struct {
// when setting up rules to proxy cluster traffic to cluster ingress
// target.
// Deprecated: use PodIPv4, PodIPv6 instead to support dual stack clusters
PodIP string
PodIPv4 string
PodIPv6 string
PodUID string
HealthCheckAddrPort string
LocalAddrPort string
MetricsEnabled bool
HealthCheckEnabled bool
DebugAddrPort string
EgressSvcsCfgPath string
PodIP string
PodIPv4 string
PodIPv6 string
PodUID string
HealthCheckAddrPort string
LocalAddrPort string
MetricsEnabled bool
HealthCheckEnabled bool
DebugAddrPort string
EgressProxiesCfgPath string
}
func configFromEnv() (*settings, error) {
@@ -107,7 +107,7 @@ func configFromEnv() (*settings, error) {
MetricsEnabled: defaultBool("TS_ENABLE_METRICS", false),
HealthCheckEnabled: defaultBool("TS_ENABLE_HEALTH_CHECK", false),
DebugAddrPort: defaultEnv("TS_DEBUG_ADDR_PORT", ""),
EgressSvcsCfgPath: defaultEnv("TS_EGRESS_SERVICES_CONFIG_PATH", ""),
EgressProxiesCfgPath: defaultEnv("TS_EGRESS_PROXIES_CONFIG_PATH", ""),
PodUID: defaultEnv("POD_UID", ""),
}
podIPs, ok := os.LookupEnv("POD_IPS")
@@ -186,7 +186,7 @@ func (s *settings) validate() error {
return fmt.Errorf("error parsing TS_HEALTHCHECK_ADDR_PORT value %q: %w", s.HealthCheckAddrPort, err)
}
}
if s.localMetricsEnabled() || s.localHealthEnabled() {
if s.localMetricsEnabled() || s.localHealthEnabled() || s.EgressProxiesCfgPath != "" {
if _, err := netip.ParseAddrPort(s.LocalAddrPort); err != nil {
return fmt.Errorf("error parsing TS_LOCAL_ADDR_PORT value %q: %w", s.LocalAddrPort, err)
}
@@ -199,8 +199,8 @@ func (s *settings) validate() error {
if s.HealthCheckEnabled && s.HealthCheckAddrPort != "" {
return errors.New("TS_HEALTHCHECK_ADDR_PORT is deprecated and will be removed in 1.82.0, use TS_ENABLE_HEALTH_CHECK and optionally TS_LOCAL_ADDR_PORT")
}
if s.EgressSvcsCfgPath != "" && !(s.InKubernetes && s.KubeSecret != "") {
return errors.New("TS_EGRESS_SERVICES_CONFIG_PATH is only supported for Tailscale running on Kubernetes")
if s.EgressProxiesCfgPath != "" && !(s.InKubernetes && s.KubeSecret != "") {
return errors.New("TS_EGRESS_PROXIES_CONFIG_PATH is only supported for Tailscale running on Kubernetes")
}
return nil
}
@@ -291,7 +291,7 @@ func isOneStepConfig(cfg *settings) bool {
// as an L3 proxy, proxying to an endpoint provided via one of the config env
// vars.
func isL3Proxy(cfg *settings) bool {
return cfg.ProxyTargetIP != "" || cfg.ProxyTargetDNSName != "" || cfg.TailnetTargetIP != "" || cfg.TailnetTargetFQDN != "" || cfg.AllowProxyingClusterTrafficViaIngress || cfg.EgressSvcsCfgPath != ""
return cfg.ProxyTargetIP != "" || cfg.ProxyTargetDNSName != "" || cfg.TailnetTargetIP != "" || cfg.TailnetTargetFQDN != "" || cfg.AllowProxyingClusterTrafficViaIngress || cfg.EgressProxiesCfgPath != ""
}
// hasKubeStateStore returns true if the state must be stored in a Kubernetes
@@ -308,6 +308,10 @@ func (cfg *settings) localHealthEnabled() bool {
return cfg.LocalAddrPort != "" && cfg.HealthCheckEnabled
}
func (cfg *settings) egressSvcsTerminateEPEnabled() bool {
return cfg.LocalAddrPort != "" && cfg.EgressProxiesCfgPath != ""
}
// defaultEnv returns the value of the given envvar name, or defVal if
// unset.
func defaultEnv(name, defVal string) string {

View File

@@ -42,14 +42,14 @@ func startTailscaled(ctx context.Context, cfg *settings) (*tailscale.LocalClient
log.Printf("Waiting for tailscaled socket")
for {
if ctx.Err() != nil {
log.Fatalf("Timed out waiting for tailscaled socket")
return nil, nil, errors.New("timed out waiting for tailscaled socket")
}
_, err := os.Stat(cfg.Socket)
if errors.Is(err, fs.ErrNotExist) {
time.Sleep(100 * time.Millisecond)
continue
} else if err != nil {
log.Fatalf("Waiting for tailscaled socket: %v", err)
return nil, nil, fmt.Errorf("error waiting for tailscaled socket: %w", err)
}
break
}

View File

@@ -189,6 +189,8 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
golang.org/x/crypto/cryptobyte/asn1 from crypto/ecdsa+
golang.org/x/crypto/curve25519 from golang.org/x/crypto/nacl/box+
golang.org/x/crypto/hkdf from crypto/tls+
golang.org/x/crypto/internal/alias from golang.org/x/crypto/chacha20+
golang.org/x/crypto/internal/poly1305 from golang.org/x/crypto/chacha20poly1305+
golang.org/x/crypto/nacl/box from tailscale.com/types/key
golang.org/x/crypto/nacl/secretbox from golang.org/x/crypto/nacl/box
golang.org/x/crypto/salsa20/salsa from golang.org/x/crypto/nacl/box+
@@ -201,6 +203,7 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
golang.org/x/net/http/httpproxy from net/http+
golang.org/x/net/http2/hpack from net/http
golang.org/x/net/idna from golang.org/x/crypto/acme/autocert+
golang.org/x/net/internal/socks from golang.org/x/net/proxy
golang.org/x/net/proxy from tailscale.com/net/netns
D golang.org/x/net/route from net+
golang.org/x/sync/errgroup from github.com/mdlayher/socket+
@@ -232,6 +235,18 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
crypto/ed25519 from crypto/tls+
crypto/elliptic from crypto/ecdsa+
crypto/hmac from crypto/tls+
crypto/internal/alias from crypto/aes+
crypto/internal/bigmod from crypto/ecdsa+
crypto/internal/boring from crypto/aes+
crypto/internal/boring/bbig from crypto/ecdsa+
crypto/internal/boring/sig from crypto/internal/boring
crypto/internal/edwards25519 from crypto/ed25519
crypto/internal/edwards25519/field from crypto/ecdh+
crypto/internal/hpke from crypto/tls
crypto/internal/mlkem768 from crypto/tls
crypto/internal/nistec from crypto/ecdh+
crypto/internal/nistec/fiat from crypto/internal/nistec
crypto/internal/randutil from crypto/dsa+
crypto/md5 from crypto/tls+
crypto/rand from crypto/ed25519+
crypto/rc4 from crypto/tls
@@ -242,6 +257,7 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
crypto/subtle from crypto/aes+
crypto/tls from golang.org/x/crypto/acme+
crypto/x509 from crypto/tls+
D crypto/x509/internal/macos from crypto/x509
crypto/x509/pkix from crypto/x509+
embed from crypto/internal/nistec+
encoding from encoding/json+
@@ -263,6 +279,44 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
hash/maphash from go4.org/mem
html from net/http/pprof+
html/template from tailscale.com/cmd/derper
internal/abi from crypto/x509/internal/macos+
internal/asan from syscall
internal/bisect from internal/godebug
internal/bytealg from bytes+
internal/byteorder from crypto/aes+
internal/chacha8rand from math/rand/v2+
internal/concurrent from unique
internal/coverage/rtcov from runtime
internal/cpu from crypto/aes+
internal/filepathlite from os+
internal/fmtsort from fmt+
internal/goarch from crypto/aes+
internal/godebug from crypto/tls+
internal/godebugs from internal/godebug+
internal/goexperiment from runtime
internal/goos from crypto/x509+
internal/itoa from internal/poll+
internal/msan from syscall
internal/nettrace from net+
internal/oserror from io/fs+
internal/poll from net+
internal/profile from net/http/pprof
internal/profilerecord from runtime+
internal/race from internal/poll+
internal/reflectlite from context+
internal/runtime/atomic from internal/runtime/exithook+
internal/runtime/exithook from runtime
L internal/runtime/syscall from runtime+
internal/singleflight from net
internal/stringslite from embed+
internal/syscall/execenv from os+
LD internal/syscall/unix from crypto/rand+
W internal/syscall/windows from crypto/rand+
W internal/syscall/windows/registry from mime+
W internal/syscall/windows/sysdll from internal/syscall/windows+
internal/testlog from os
internal/unsafeheader from internal/reflectlite+
internal/weak from unique
io from bufio+
io/fs from crypto/x509+
L io/ioutil from github.com/mitchellh/go-ps+
@@ -282,6 +336,7 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
net/http from expvar+
net/http/httptrace from net/http+
net/http/internal from net/http
net/http/internal/ascii from net/http
net/http/pprof from tailscale.com/tsweb
net/netip from go4.org/netipx+
net/textproto from golang.org/x/net/http/httpguts+
@@ -295,7 +350,10 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
reflect from crypto/x509+
regexp from github.com/coreos/go-iptables/iptables+
regexp/syntax from regexp
runtime from crypto/internal/nistec+
runtime/debug from github.com/prometheus/client_golang/prometheus+
runtime/internal/math from runtime
runtime/internal/sys from runtime
runtime/metrics from github.com/prometheus/client_golang/prometheus+
runtime/pprof from net/http/pprof
runtime/trace from net/http/pprof
@@ -314,3 +372,4 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
unicode/utf16 from crypto/x509+
unicode/utf8 from bufio+
unique from net/netip
unsafe from bytes+

View File

@@ -197,7 +197,6 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
W 💣 github.com/tailscale/go-winio/internal/socket from github.com/tailscale/go-winio
W github.com/tailscale/go-winio/internal/stringbuffer from github.com/tailscale/go-winio/internal/fs
W github.com/tailscale/go-winio/pkg/guid from github.com/tailscale/go-winio+
github.com/tailscale/golang-x-crypto/acme from tailscale.com/ipn/ipnlocal
LD github.com/tailscale/golang-x-crypto/internal/poly1305 from github.com/tailscale/golang-x-crypto/ssh
LD github.com/tailscale/golang-x-crypto/ssh from tailscale.com/ipn/ipnlocal
LD github.com/tailscale/golang-x-crypto/ssh/internal/bcrypt_pbkdf from github.com/tailscale/golang-x-crypto/ssh
@@ -802,6 +801,7 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
tailscale.com/envknob from tailscale.com/client/tailscale+
tailscale.com/envknob/featureknob from tailscale.com/client/web+
tailscale.com/feature from tailscale.com/feature/wakeonlan+
tailscale.com/feature/capture from tailscale.com/feature/condregister
tailscale.com/feature/condregister from tailscale.com/tsnet
L tailscale.com/feature/tap from tailscale.com/feature/condregister
tailscale.com/feature/wakeonlan from tailscale.com/feature/condregister
@@ -814,7 +814,7 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
💣 tailscale.com/ipn/ipnauth from tailscale.com/ipn/ipnlocal+
tailscale.com/ipn/ipnlocal from tailscale.com/ipn/localapi+
tailscale.com/ipn/ipnstate from tailscale.com/client/tailscale+
tailscale.com/ipn/localapi from tailscale.com/tsnet
tailscale.com/ipn/localapi from tailscale.com/tsnet+
tailscale.com/ipn/policy from tailscale.com/ipn/ipnlocal
tailscale.com/ipn/store from tailscale.com/ipn/ipnlocal+
L tailscale.com/ipn/store/awsstore from tailscale.com/ipn/store
@@ -887,7 +887,9 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
tailscale.com/syncs from tailscale.com/control/controlknobs+
tailscale.com/tailcfg from tailscale.com/client/tailscale+
tailscale.com/taildrop from tailscale.com/ipn/ipnlocal+
tailscale.com/tempfork/acme from tailscale.com/ipn/ipnlocal
tailscale.com/tempfork/heap from tailscale.com/wgengine/magicsock
tailscale.com/tempfork/httprec from tailscale.com/control/controlclient
tailscale.com/tka from tailscale.com/client/tailscale+
tailscale.com/tsconst from tailscale.com/net/netmon+
tailscale.com/tsd from tailscale.com/ipn/ipnlocal+
@@ -969,7 +971,6 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
tailscale.com/version from tailscale.com/client/web+
tailscale.com/version/distro from tailscale.com/client/web+
tailscale.com/wgengine from tailscale.com/ipn/ipnlocal+
tailscale.com/wgengine/capture from tailscale.com/ipn/ipnlocal+
tailscale.com/wgengine/filter from tailscale.com/control/controlclient+
tailscale.com/wgengine/filter/filtertype from tailscale.com/types/netmap+
💣 tailscale.com/wgengine/magicsock from tailscale.com/ipn/ipnlocal+
@@ -992,6 +993,8 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
golang.org/x/crypto/cryptobyte/asn1 from crypto/ecdsa+
golang.org/x/crypto/curve25519 from github.com/tailscale/golang-x-crypto/ssh+
golang.org/x/crypto/hkdf from crypto/tls+
golang.org/x/crypto/internal/alias from golang.org/x/crypto/chacha20+
golang.org/x/crypto/internal/poly1305 from golang.org/x/crypto/chacha20poly1305+
golang.org/x/crypto/nacl/box from tailscale.com/types/key
golang.org/x/crypto/nacl/secretbox from golang.org/x/crypto/nacl/box
golang.org/x/crypto/poly1305 from github.com/tailscale/wireguard-go/device
@@ -1009,6 +1012,10 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
golang.org/x/net/http2/hpack from golang.org/x/net/http2+
golang.org/x/net/icmp from github.com/prometheus-community/pro-bing+
golang.org/x/net/idna from golang.org/x/net/http/httpguts+
golang.org/x/net/internal/httpcommon from golang.org/x/net/http2
golang.org/x/net/internal/iana from golang.org/x/net/icmp+
golang.org/x/net/internal/socket from golang.org/x/net/icmp+
golang.org/x/net/internal/socks from golang.org/x/net/proxy
golang.org/x/net/ipv4 from github.com/miekg/dns+
golang.org/x/net/ipv6 from github.com/miekg/dns+
golang.org/x/net/proxy from tailscale.com/net/netns
@@ -1050,6 +1057,18 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
crypto/ed25519 from crypto/tls+
crypto/elliptic from crypto/ecdsa+
crypto/hmac from crypto/tls+
crypto/internal/alias from crypto/aes+
crypto/internal/bigmod from crypto/ecdsa+
crypto/internal/boring from crypto/aes+
crypto/internal/boring/bbig from crypto/ecdsa+
crypto/internal/boring/sig from crypto/internal/boring
crypto/internal/edwards25519 from crypto/ed25519
crypto/internal/edwards25519/field from crypto/ecdh+
crypto/internal/hpke from crypto/tls
crypto/internal/mlkem768 from crypto/tls
crypto/internal/nistec from crypto/ecdh+
crypto/internal/nistec/fiat from crypto/internal/nistec
crypto/internal/randutil from crypto/dsa+
crypto/md5 from crypto/tls+
crypto/rand from crypto/ed25519+
crypto/rc4 from crypto/tls+
@@ -1060,6 +1079,7 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
crypto/subtle from crypto/aes+
crypto/tls from github.com/aws/aws-sdk-go-v2/aws/transport/http+
crypto/x509 from crypto/tls+
D crypto/x509/internal/macos from crypto/x509
crypto/x509/pkix from crypto/x509+
database/sql from github.com/prometheus/client_golang/prometheus/collectors
database/sql/driver from database/sql+
@@ -1085,6 +1105,7 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
go/build/constraint from go/parser
go/doc from k8s.io/apimachinery/pkg/runtime
go/doc/comment from go/doc
go/internal/typeparams from go/parser
go/parser from k8s.io/apimachinery/pkg/runtime
go/scanner from go/ast+
go/token from go/ast+
@@ -1095,6 +1116,46 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
hash/maphash from go4.org/mem
html from html/template+
html/template from github.com/gorilla/csrf
internal/abi from crypto/x509/internal/macos+
internal/asan from syscall
internal/bisect from internal/godebug
internal/bytealg from bytes+
internal/byteorder from crypto/aes+
internal/chacha8rand from math/rand/v2+
internal/concurrent from unique
internal/coverage/rtcov from runtime
internal/cpu from crypto/aes+
internal/filepathlite from os+
internal/fmtsort from fmt+
internal/goarch from crypto/aes+
internal/godebug from archive/tar+
internal/godebugs from internal/godebug+
internal/goexperiment from runtime
internal/goos from crypto/x509+
internal/itoa from internal/poll+
internal/lazyregexp from go/doc
internal/msan from syscall
internal/nettrace from net+
internal/oserror from io/fs+
internal/poll from net+
internal/profile from net/http/pprof
internal/profilerecord from runtime+
internal/race from internal/poll+
internal/reflectlite from context+
internal/runtime/atomic from internal/runtime/exithook+
internal/runtime/exithook from runtime
L internal/runtime/syscall from runtime+
internal/saferio from debug/pe+
internal/singleflight from net
internal/stringslite from embed+
internal/syscall/execenv from os+
LD internal/syscall/unix from crypto/rand+
W internal/syscall/windows from crypto/rand+
W internal/syscall/windows/registry from mime+
W internal/syscall/windows/sysdll from internal/syscall/windows+
internal/testlog from os
internal/unsafeheader from internal/reflectlite+
internal/weak from unique
io from archive/tar+
io/fs from archive/tar+
io/ioutil from github.com/aws/aws-sdk-go-v2/aws/protocol/query+
@@ -1103,6 +1164,7 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
log/internal from log+
log/slog from github.com/go-logr/logr+
log/slog/internal from log/slog
log/slog/internal/buffer from log/slog
maps from sigs.k8s.io/controller-runtime/pkg/predicate+
math from archive/tar+
math/big from crypto/dsa+
@@ -1114,10 +1176,10 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
mime/quotedprintable from mime/multipart
net from crypto/tls+
net/http from expvar+
net/http/httptest from tailscale.com/control/controlclient
net/http/httptrace from github.com/prometheus-community/pro-bing+
net/http/httputil from github.com/aws/smithy-go/transport/http+
net/http/internal from net/http+
net/http/internal/ascii from net/http+
net/http/pprof from sigs.k8s.io/controller-runtime/pkg/manager+
net/netip from github.com/gaissmai/bart+
net/textproto from github.com/aws/aws-sdk-go-v2/aws/signer/v4+
@@ -1131,7 +1193,10 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
reflect from archive/tar+
regexp from github.com/aws/aws-sdk-go-v2/internal/endpoints+
regexp/syntax from regexp
runtime from archive/tar+
runtime/debug from github.com/aws/aws-sdk-go-v2/internal/sync/singleflight+
runtime/internal/math from runtime
runtime/internal/sys from runtime
runtime/metrics from github.com/prometheus/client_golang/prometheus+
runtime/pprof from net/http/pprof+
runtime/trace from net/http/pprof
@@ -1150,3 +1215,4 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
unicode/utf16 from crypto/x509+
unicode/utf8 from bufio+
unique from net/netip
unsafe from bytes+

View File

@@ -63,7 +63,10 @@ rules:
verbs: ["create","delete","deletecollection","get","list","patch","update","watch"]
- apiGroups: [""]
resources: ["pods"]
verbs: ["get","list","watch"]
verbs: ["get","list","watch", "update"]
- apiGroups: [""]
resources: ["pods/status"]
verbs: ["update"]
- apiGroups: ["apps"]
resources: ["statefulsets", "deployments"]
verbs: ["create","delete","deletecollection","get","list","patch","update","watch"]

View File

@@ -103,7 +103,7 @@ spec:
pattern: ^tag:[a-zA-Z][a-zA-Z0-9-]*$
type:
description: |-
Type of the ProxyGroup proxies. Supported types are egress and ingress.
Type of the ProxyGroup proxies. Currently the only supported type is egress.
Type is immutable once a ProxyGroup is created.
type: string
enum:

View File

@@ -2860,7 +2860,7 @@ spec:
type: array
type:
description: |-
Type of the ProxyGroup proxies. Supported types are egress and ingress.
Type of the ProxyGroup proxies. Currently the only supported type is egress.
Type is immutable once a ProxyGroup is created.
enum:
- egress
@@ -4854,6 +4854,13 @@ rules:
- get
- list
- watch
- update
- apiGroups:
- ""
resources:
- pods/status
verbs:
- update
- apiGroups:
- apps
resources:

View File

@@ -20,7 +20,6 @@ import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
tsoperator "tailscale.com/k8s-operator"
"tailscale.com/kube/egressservices"
"tailscale.com/types/ptr"
)
@@ -71,25 +70,27 @@ func (er *egressEpsReconciler) Reconcile(ctx context.Context, req reconcile.Requ
if err != nil {
return res, fmt.Errorf("error retrieving ExternalName Service: %w", err)
}
if !tsoperator.EgressServiceIsValidAndConfigured(svc) {
l.Infof("Cluster resources for ExternalName Service %s/%s are not yet configured", svc.Namespace, svc.Name)
return res, nil
}
// TODO(irbekrm): currently this reconcile loop runs all the checks every time it's triggered, which is
// wasteful. Once we have a Ready condition for ExternalName Services for ProxyGroup, use the condition to
// determine if a reconcile is needed.
oldEps := eps.DeepCopy()
proxyGroupName := eps.Labels[labelProxyGroup]
tailnetSvc := tailnetSvcName(svc)
l = l.With("tailnet-service-name", tailnetSvc)
// Retrieve the desired tailnet service configuration from the ConfigMap.
proxyGroupName := eps.Labels[labelProxyGroup]
_, cfgs, err := egressSvcsConfigs(ctx, er.Client, proxyGroupName, er.tsNamespace)
if err != nil {
return res, fmt.Errorf("error retrieving tailnet services configuration: %w", err)
}
if cfgs == nil {
// TODO(irbekrm): this path would be hit if egress service was once exposed on a ProxyGroup that later
// got deleted. Probably the EndpointSlices then need to be deleted too- need to rethink this flow.
l.Debugf("No egress config found, likely because ProxyGroup has not been created")
return res, nil
}
cfg, ok := (*cfgs)[tailnetSvc]
if !ok {
l.Infof("[unexpected] configuration for tailnet service %s not found", tailnetSvc)

View File

@@ -0,0 +1,274 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"context"
"errors"
"fmt"
"net/http"
"slices"
"strings"
"sync/atomic"
"time"
"go.uber.org/zap"
xslices "golang.org/x/exp/slices"
corev1 "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/kubetypes"
"tailscale.com/logtail/backoff"
"tailscale.com/tstime"
"tailscale.com/util/httpm"
)
const tsEgressReadinessGate = "tailscale.com/egress-services"
// egressPodsReconciler is responsible for setting tailscale.com/egress-services condition on egress ProxyGroup Pods.
// The condition is used as a readiness gate for the Pod, meaning that kubelet will not mark the Pod as ready before the
// condition is set. The ProxyGroup StatefulSet updates are rolled out in such a way that no Pod is restarted, before
// the previous Pod is marked as ready, so ensuring that the Pod does not get marked as ready when it is not yet able to
// route traffic for egress service prevents downtime during restarts caused by no available endpoints left because
// every Pod has been recreated and is not yet added to endpoints.
// https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-readiness-gate
type egressPodsReconciler struct {
client.Client
logger *zap.SugaredLogger
tsNamespace string
clock tstime.Clock
httpClient doer // http client that can be set to a mock client in tests
maxBackoff time.Duration // max backoff period between health check calls
}
// Reconcile reconciles an egress ProxyGroup Pods on changes to those Pods and ProxyGroup EndpointSlices. It ensures
// that for each Pod who is ready to route traffic to all egress services for the ProxyGroup, the Pod has a
// tailscale.com/egress-services condition to set, so that kubelet will mark the Pod as ready.
//
// For the Pod to be ready
// to route traffic to the egress service, the kube proxy needs to have set up the Pod's IP as an endpoint for the
// ClusterIP Service corresponding to the egress service.
//
// Note that the endpoints for the ClusterIP Service are configured by the operator itself using custom
// EndpointSlices(egress-eps-reconciler), so the routing is not blocked on Pod's readiness.
//
// Each egress service has a corresponding ClusterIP Service, that exposes all user configured
// tailnet ports, as well as a health check port for the proxy.
//
// The reconciler calls the health check endpoint of each Service up to N number of times, where N is the number of
// replicas for the ProxyGroup x 3, and checks if the received response is healthy response from the Pod being reconciled.
//
// The health check response contains a header with the
// Pod's IP address- this is used to determine whether the response is received from this Pod.
//
// If the Pod does not appear to be serving the health check endpoint (pre-v1.80 proxies), the reconciler just sets the
// readiness condition for backwards compatibility reasons.
func (er *egressPodsReconciler) Reconcile(ctx context.Context, req reconcile.Request) (res reconcile.Result, err error) {
l := er.logger.With("Pod", req.NamespacedName)
l.Debugf("starting reconcile")
defer l.Debugf("reconcile finished")
pod := new(corev1.Pod)
err = er.Get(ctx, req.NamespacedName, pod)
if apierrors.IsNotFound(err) {
return reconcile.Result{}, nil
}
if err != nil {
return reconcile.Result{}, fmt.Errorf("failed to get Pod: %w", err)
}
if !pod.DeletionTimestamp.IsZero() {
l.Debugf("Pod is being deleted, do nothing")
return res, nil
}
if pod.Labels[LabelParentType] != proxyTypeProxyGroup {
l.Infof("[unexpected] reconciler called for a Pod that is not a ProxyGroup Pod")
return res, nil
}
// If the Pod does not have the readiness gate set, there is no need to add the readiness condition. In practice
// this will happen if the user has configured custom TS_LOCAL_ADDR_PORT, thus disabling the graceful failover.
if !slices.ContainsFunc(pod.Spec.ReadinessGates, func(r corev1.PodReadinessGate) bool {
return r.ConditionType == tsEgressReadinessGate
}) {
l.Debug("Pod does not have egress readiness gate set, skipping")
return res, nil
}
proxyGroupName := pod.Labels[LabelParentName]
pg := new(tsapi.ProxyGroup)
if err := er.Get(ctx, types.NamespacedName{Name: proxyGroupName}, pg); err != nil {
return res, fmt.Errorf("error getting ProxyGroup %q: %w", proxyGroupName, err)
}
if pg.Spec.Type != typeEgress {
l.Infof("[unexpected] reconciler called for %q ProxyGroup Pod", pg.Spec.Type)
return res, nil
}
// Get all ClusterIP Services for all egress targets exposed to cluster via this ProxyGroup.
lbls := map[string]string{
LabelManaged: "true",
labelProxyGroup: proxyGroupName,
labelSvcType: typeEgress,
}
svcs := &corev1.ServiceList{}
if err := er.List(ctx, svcs, client.InNamespace(er.tsNamespace), client.MatchingLabels(lbls)); err != nil {
return res, fmt.Errorf("error listing ClusterIP Services")
}
idx := xslices.IndexFunc(pod.Status.Conditions, func(c corev1.PodCondition) bool {
return c.Type == tsEgressReadinessGate
})
if idx != -1 {
l.Debugf("Pod is already ready, do nothing")
return res, nil
}
var routesMissing atomic.Bool
errChan := make(chan error, len(svcs.Items))
for _, svc := range svcs.Items {
s := svc
go func() {
ll := l.With("service_name", s.Name)
d := retrieveClusterDomain(er.tsNamespace, ll)
healthCheckAddr := healthCheckForSvc(&s, d)
if healthCheckAddr == "" {
ll.Debugf("ClusterIP Service does not expose a health check endpoint, unable to verify if routing is set up")
errChan <- nil
return
}
var routesSetup bool
bo := backoff.NewBackoff(s.Name, ll.Infof, er.maxBackoff)
for range numCalls(pgReplicas(pg)) {
if ctx.Err() != nil {
errChan <- nil
return
}
state, err := er.lookupPodRouteViaSvc(ctx, pod, healthCheckAddr, ll)
if err != nil {
errChan <- fmt.Errorf("error validating if routing has been set up for Pod: %w", err)
return
}
if state == healthy || state == cannotVerify {
routesSetup = true
break
}
if state == unreachable || state == unhealthy || state == podNotReady {
bo.BackOff(ctx, errors.New("backoff"))
}
}
if !routesSetup {
ll.Debugf("Pod is not yet configured as Service endpoint")
routesMissing.Store(true)
}
errChan <- nil
}()
}
for range len(svcs.Items) {
e := <-errChan
err = errors.Join(err, e)
}
if err != nil {
return res, fmt.Errorf("error verifying conectivity: %w", err)
}
if rm := routesMissing.Load(); rm {
l.Info("Pod is not yet added as an endpoint for all egress targets, waiting...")
return reconcile.Result{RequeueAfter: shortRequeue}, nil
}
if err := er.setPodReady(ctx, pod, l); err != nil {
return res, fmt.Errorf("error setting Pod as ready: %w", err)
}
return res, nil
}
func (er *egressPodsReconciler) setPodReady(ctx context.Context, pod *corev1.Pod, l *zap.SugaredLogger) error {
if slices.ContainsFunc(pod.Status.Conditions, func(c corev1.PodCondition) bool {
return c.Type == tsEgressReadinessGate
}) {
return nil
}
l.Infof("Pod is ready to route traffic to all egress targets")
pod.Status.Conditions = append(pod.Status.Conditions, corev1.PodCondition{
Type: tsEgressReadinessGate,
Status: corev1.ConditionTrue,
LastTransitionTime: metav1.Time{Time: er.clock.Now()},
})
return er.Status().Update(ctx, pod)
}
// healthCheckState is the result of a single request to an egress Service health check endpoint with a goal to hit a
// specific backend Pod.
type healthCheckState int8
const (
cannotVerify healthCheckState = iota // not verifiable for this setup (i.e earlier proxy version)
unreachable // no backends or another network error
notFound // hit another backend
unhealthy // not 200
podNotReady // Pod is not ready, i.e does not have an IP address yet
healthy // 200
)
// lookupPodRouteViaSvc attempts to reach a Pod using a health check endpoint served by a Service and returns the state of the health check.
func (er *egressPodsReconciler) lookupPodRouteViaSvc(ctx context.Context, pod *corev1.Pod, healthCheckAddr string, l *zap.SugaredLogger) (healthCheckState, error) {
if !slices.ContainsFunc(pod.Spec.Containers[0].Env, func(e corev1.EnvVar) bool {
return e.Name == "TS_ENABLE_HEALTH_CHECK" && e.Value == "true"
}) {
l.Debugf("Pod does not have health check enabled, unable to verify if it is currently routable via Service")
return cannotVerify, nil
}
wantsIP, err := podIPv4(pod)
if err != nil {
return -1, fmt.Errorf("error determining Pod's IP address: %w", err)
}
if wantsIP == "" {
return podNotReady, nil
}
ctx, cancel := context.WithTimeout(ctx, time.Second*3)
defer cancel()
req, err := http.NewRequestWithContext(ctx, httpm.GET, healthCheckAddr, nil)
if err != nil {
return -1, fmt.Errorf("error creating new HTTP request: %w", err)
}
// Do not re-use the same connection for the next request so to maximize the chance of hitting all backends equally.
req.Close = true
resp, err := er.httpClient.Do(req)
if err != nil {
// This is most likely because this is the first Pod and is not yet added to Service endoints. Other
// error types are possible, but checking for those would likely make the system too fragile.
return unreachable, nil
}
defer resp.Body.Close()
gotIP := resp.Header.Get(kubetypes.PodIPv4Header)
if gotIP == "" {
l.Debugf("Health check does not return Pod's IP header, unable to verify if Pod is currently routable via Service")
return cannotVerify, nil
}
if !strings.EqualFold(wantsIP, gotIP) {
return notFound, nil
}
if resp.StatusCode != http.StatusOK {
return unhealthy, nil
}
return healthy, nil
}
// numCalls return the number of times an endpoint on a ProxyGroup Service should be called till it can be safely
// assumed that, if none of the responses came back from a specific Pod then traffic for the Service is currently not
// being routed to that Pod. This assumes that traffic for the Service is routed via round robin, so
// InternalTrafficPolicy must be 'Cluster' and session affinity must be None.
func numCalls(replicas int32) int32 {
return replicas * 3
}
// doer is an interface for HTTP client that can be set to a mock client in tests.
type doer interface {
Do(*http.Request) (*http.Response, error)
}

View File

@@ -0,0 +1,525 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"bytes"
"errors"
"fmt"
"io"
"log"
"net/http"
"sync"
"testing"
"time"
"go.uber.org/zap"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/intstr"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/kubetypes"
"tailscale.com/tstest"
"tailscale.com/types/ptr"
)
func TestEgressPodReadiness(t *testing.T) {
// We need to pass a Pod object to WithStatusSubresource because of some quirks in how the fake client
// works. Without this code we would not be able to update Pod's status further down.
fc := fake.NewClientBuilder().
WithScheme(tsapi.GlobalScheme).
WithStatusSubresource(&corev1.Pod{}).
Build()
zl, _ := zap.NewDevelopment()
cl := tstest.NewClock(tstest.ClockOpts{})
rec := &egressPodsReconciler{
tsNamespace: "operator-ns",
Client: fc,
logger: zl.Sugar(),
clock: cl,
}
pg := &tsapi.ProxyGroup{
ObjectMeta: metav1.ObjectMeta{
Name: "dev",
},
Spec: tsapi.ProxyGroupSpec{
Type: "egress",
Replicas: ptr.To(int32(3)),
},
}
mustCreate(t, fc, pg)
podIP := "10.0.0.2"
podTemplate := &corev1.Pod{
ObjectMeta: metav1.ObjectMeta{
Namespace: "operator-ns",
Name: "pod",
Labels: map[string]string{
LabelParentType: "proxygroup",
LabelParentName: "dev",
},
},
Spec: corev1.PodSpec{
ReadinessGates: []corev1.PodReadinessGate{{
ConditionType: tsEgressReadinessGate,
}},
Containers: []corev1.Container{{
Name: "tailscale",
Env: []corev1.EnvVar{{
Name: "TS_ENABLE_HEALTH_CHECK",
Value: "true",
}},
}},
},
Status: corev1.PodStatus{
PodIPs: []corev1.PodIP{{IP: podIP}},
},
}
t.Run("no_egress_services", func(t *testing.T) {
pod := podTemplate.DeepCopy()
mustCreate(t, fc, pod)
expectReconciled(t, rec, "operator-ns", pod.Name)
// Pod should have readiness gate condition set.
podSetReady(pod, cl)
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod)
})
t.Run("one_svc_already_routed_to", func(t *testing.T) {
pod := podTemplate.DeepCopy()
svc, hep := newSvc("svc", 9002)
mustCreateAll(t, fc, svc, pod)
resp := readyResps(podIP, 1)
httpCl := fakeHTTPClient{
t: t,
state: map[string][]fakeResponse{hep: resp},
}
rec.httpClient = &httpCl
expectReconciled(t, rec, "operator-ns", pod.Name)
// Pod should have readiness gate condition set.
podSetReady(pod, cl)
expectEqual(t, fc, pod)
// A subsequent reconcile should not change the Pod.
expectReconciled(t, rec, "operator-ns", pod.Name)
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc)
})
t.Run("one_svc_many_backends_eventually_routed_to", func(t *testing.T) {
pod := podTemplate.DeepCopy()
svc, hep := newSvc("svc", 9002)
mustCreateAll(t, fc, svc, pod)
// For a 3 replica ProxyGroup the healthcheck endpoint should be called 9 times, make the 9th time only
// return with the right Pod IP.
resps := append(readyResps("10.0.0.3", 4), append(readyResps("10.0.0.4", 4), readyResps(podIP, 1)...)...)
httpCl := fakeHTTPClient{
t: t,
state: map[string][]fakeResponse{hep: resps},
}
rec.httpClient = &httpCl
expectReconciled(t, rec, "operator-ns", pod.Name)
// Pod should have readiness gate condition set.
podSetReady(pod, cl)
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc)
})
t.Run("one_svc_one_backend_eventually_healthy", func(t *testing.T) {
pod := podTemplate.DeepCopy()
svc, hep := newSvc("svc", 9002)
mustCreateAll(t, fc, svc, pod)
// For a 3 replica ProxyGroup the healthcheck endpoint should be called 9 times, make the 9th time only
// return with 200 status code.
resps := append(unreadyResps(podIP, 8), readyResps(podIP, 1)...)
httpCl := fakeHTTPClient{
t: t,
state: map[string][]fakeResponse{hep: resps},
}
rec.httpClient = &httpCl
expectReconciled(t, rec, "operator-ns", pod.Name)
// Pod should have readiness gate condition set.
podSetReady(pod, cl)
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc)
})
t.Run("one_svc_one_backend_never_routable", func(t *testing.T) {
pod := podTemplate.DeepCopy()
svc, hep := newSvc("svc", 9002)
mustCreateAll(t, fc, svc, pod)
// For a 3 replica ProxyGroup the healthcheck endpoint should be called 9 times and Pod should be
// requeued if neither of those succeed.
resps := readyResps("10.0.0.3", 9)
httpCl := fakeHTTPClient{
t: t,
state: map[string][]fakeResponse{hep: resps},
}
rec.httpClient = &httpCl
expectRequeue(t, rec, "operator-ns", pod.Name)
// Pod should not have readiness gate condition set.
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc)
})
t.Run("one_svc_many_backends_already_routable", func(t *testing.T) {
pod := podTemplate.DeepCopy()
svc, hep := newSvc("svc", 9002)
svc2, hep2 := newSvc("svc-2", 9002)
svc3, hep3 := newSvc("svc-3", 9002)
mustCreateAll(t, fc, svc, svc2, svc3, pod)
resps := readyResps(podIP, 1)
httpCl := fakeHTTPClient{
t: t,
state: map[string][]fakeResponse{
hep: resps,
hep2: resps,
hep3: resps,
},
}
rec.httpClient = &httpCl
expectReconciled(t, rec, "operator-ns", pod.Name)
// Pod should not have readiness gate condition set.
podSetReady(pod, cl)
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc, svc2, svc3)
})
t.Run("one_svc_many_backends_eventually_routable_and_healthy", func(t *testing.T) {
pod := podTemplate.DeepCopy()
svc, hep := newSvc("svc", 9002)
svc2, hep2 := newSvc("svc-2", 9002)
svc3, hep3 := newSvc("svc-3", 9002)
mustCreateAll(t, fc, svc, svc2, svc3, pod)
resps := append(readyResps("10.0.0.3", 7), readyResps(podIP, 1)...)
resps2 := append(readyResps("10.0.0.3", 5), readyResps(podIP, 1)...)
resps3 := append(unreadyResps(podIP, 4), readyResps(podIP, 1)...)
httpCl := fakeHTTPClient{
t: t,
state: map[string][]fakeResponse{
hep: resps,
hep2: resps2,
hep3: resps3,
},
}
rec.httpClient = &httpCl
expectReconciled(t, rec, "operator-ns", pod.Name)
// Pod should have readiness gate condition set.
podSetReady(pod, cl)
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc, svc2, svc3)
})
t.Run("one_svc_many_backends_never_routable_and_healthy", func(t *testing.T) {
pod := podTemplate.DeepCopy()
svc, hep := newSvc("svc", 9002)
svc2, hep2 := newSvc("svc-2", 9002)
svc3, hep3 := newSvc("svc-3", 9002)
mustCreateAll(t, fc, svc, svc2, svc3, pod)
// For a ProxyGroup with 3 replicas, each Service's health endpoint will be tried 9 times and the Pod
// will be requeued if neither succeeds.
resps := readyResps("10.0.0.3", 9)
resps2 := append(readyResps("10.0.0.3", 5), readyResps("10.0.0.4", 4)...)
resps3 := unreadyResps(podIP, 9)
httpCl := fakeHTTPClient{
t: t,
state: map[string][]fakeResponse{
hep: resps,
hep2: resps2,
hep3: resps3,
},
}
rec.httpClient = &httpCl
expectRequeue(t, rec, "operator-ns", pod.Name)
// Pod should not have readiness gate condition set.
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc, svc2, svc3)
})
t.Run("one_svc_many_backends_one_never_routable", func(t *testing.T) {
pod := podTemplate.DeepCopy()
svc, hep := newSvc("svc", 9002)
svc2, hep2 := newSvc("svc-2", 9002)
svc3, hep3 := newSvc("svc-3", 9002)
mustCreateAll(t, fc, svc, svc2, svc3, pod)
// For a ProxyGroup with 3 replicas, each Service's health endpoint will be tried 9 times and the Pod
// will be requeued if any one never succeeds.
resps := readyResps(podIP, 9)
resps2 := readyResps(podIP, 9)
resps3 := append(readyResps("10.0.0.3", 5), readyResps("10.0.0.4", 4)...)
httpCl := fakeHTTPClient{
t: t,
state: map[string][]fakeResponse{
hep: resps,
hep2: resps2,
hep3: resps3,
},
}
rec.httpClient = &httpCl
expectRequeue(t, rec, "operator-ns", pod.Name)
// Pod should not have readiness gate condition set.
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc, svc2, svc3)
})
t.Run("one_svc_many_backends_one_never_healthy", func(t *testing.T) {
pod := podTemplate.DeepCopy()
svc, hep := newSvc("svc", 9002)
svc2, hep2 := newSvc("svc-2", 9002)
svc3, hep3 := newSvc("svc-3", 9002)
mustCreateAll(t, fc, svc, svc2, svc3, pod)
// For a ProxyGroup with 3 replicas, each Service's health endpoint will be tried 9 times and the Pod
// will be requeued if any one never succeeds.
resps := readyResps(podIP, 9)
resps2 := unreadyResps(podIP, 9)
resps3 := readyResps(podIP, 9)
httpCl := fakeHTTPClient{
t: t,
state: map[string][]fakeResponse{
hep: resps,
hep2: resps2,
hep3: resps3,
},
}
rec.httpClient = &httpCl
expectRequeue(t, rec, "operator-ns", pod.Name)
// Pod should not have readiness gate condition set.
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc, svc2, svc3)
})
t.Run("one_svc_many_backends_different_ports_eventually_healthy_and_routable", func(t *testing.T) {
pod := podTemplate.DeepCopy()
svc, hep := newSvc("svc", 9003)
svc2, hep2 := newSvc("svc-2", 9004)
svc3, hep3 := newSvc("svc-3", 9010)
mustCreateAll(t, fc, svc, svc2, svc3, pod)
// For a ProxyGroup with 3 replicas, each Service's health endpoint will be tried up to 9 times and
// marked as success as soon as one try succeeds.
resps := append(readyResps("10.0.0.3", 7), readyResps(podIP, 1)...)
resps2 := append(readyResps("10.0.0.3", 5), readyResps(podIP, 1)...)
resps3 := append(unreadyResps(podIP, 4), readyResps(podIP, 1)...)
httpCl := fakeHTTPClient{
t: t,
state: map[string][]fakeResponse{
hep: resps,
hep2: resps2,
hep3: resps3,
},
}
rec.httpClient = &httpCl
expectReconciled(t, rec, "operator-ns", pod.Name)
// Pod should have readiness gate condition set.
podSetReady(pod, cl)
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc, svc2, svc3)
})
// Proxies of 1.78 and earlier did not set the Pod IP header.
t.Run("pod_does_not_return_ip_header", func(t *testing.T) {
pod := podTemplate.DeepCopy()
pod.Name = "foo-bar"
svc, hep := newSvc("foo-bar", 9002)
mustCreateAll(t, fc, svc, pod)
// If a response does not contain Pod IP header, we assume that this is an earlier proxy version,
// readiness cannot be verified so the readiness gate is just set to true.
resps := unreadyResps("", 1)
httpCl := fakeHTTPClient{
t: t,
state: map[string][]fakeResponse{
hep: resps,
},
}
rec.httpClient = &httpCl
expectReconciled(t, rec, "operator-ns", pod.Name)
// Pod should have readiness gate condition set.
podSetReady(pod, cl)
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc)
})
t.Run("one_svc_one_backend_eventually_healthy_and_routable", func(t *testing.T) {
pod := podTemplate.DeepCopy()
svc, hep := newSvc("svc", 9002)
mustCreateAll(t, fc, svc, pod)
// If a response errors, it is probably because the Pod is not yet properly running, so retry.
resps := append(erroredResps(8), readyResps(podIP, 1)...)
httpCl := fakeHTTPClient{
t: t,
state: map[string][]fakeResponse{
hep: resps,
},
}
rec.httpClient = &httpCl
expectReconciled(t, rec, "operator-ns", pod.Name)
// Pod should have readiness gate condition set.
podSetReady(pod, cl)
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc)
})
t.Run("one_svc_one_backend_svc_does_not_have_health_port", func(t *testing.T) {
pod := podTemplate.DeepCopy()
// If a Service does not have health port set, we assume that it is not possible to determine Pod's
// readiness and set it to ready.
svc, _ := newSvc("svc", -1)
mustCreateAll(t, fc, svc, pod)
rec.httpClient = nil
expectReconciled(t, rec, "operator-ns", pod.Name)
// Pod should have readiness gate condition set.
podSetReady(pod, cl)
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc)
})
t.Run("error_setting_up_healthcheck", func(t *testing.T) {
pod := podTemplate.DeepCopy()
// This is not a realistic reason for error, but we are just testing the behaviour of a healthcheck
// lookup failing.
pod.Status.PodIPs = []corev1.PodIP{{IP: "not-an-ip"}}
svc, _ := newSvc("svc", 9002)
svc2, _ := newSvc("svc-2", 9002)
svc3, _ := newSvc("svc-3", 9002)
mustCreateAll(t, fc, svc, svc2, svc3, pod)
rec.httpClient = nil
expectError(t, rec, "operator-ns", pod.Name)
// Pod should not have readiness gate condition set.
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc, svc2, svc3)
})
t.Run("pod_does_not_have_an_ip_address", func(t *testing.T) {
pod := podTemplate.DeepCopy()
pod.Status.PodIPs = nil
svc, _ := newSvc("svc", 9002)
svc2, _ := newSvc("svc-2", 9002)
svc3, _ := newSvc("svc-3", 9002)
mustCreateAll(t, fc, svc, svc2, svc3, pod)
rec.httpClient = nil
expectRequeue(t, rec, "operator-ns", pod.Name)
// Pod should not have readiness gate condition set.
expectEqual(t, fc, pod)
mustDeleteAll(t, fc, pod, svc, svc2, svc3)
})
}
func readyResps(ip string, num int) (resps []fakeResponse) {
for range num {
resps = append(resps, fakeResponse{statusCode: 200, podIP: ip})
}
return resps
}
func unreadyResps(ip string, num int) (resps []fakeResponse) {
for range num {
resps = append(resps, fakeResponse{statusCode: 503, podIP: ip})
}
return resps
}
func erroredResps(num int) (resps []fakeResponse) {
for range num {
resps = append(resps, fakeResponse{err: errors.New("timeout")})
}
return resps
}
func newSvc(name string, port int32) (*corev1.Service, string) {
svc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Namespace: "operator-ns",
Name: name,
Labels: map[string]string{
LabelManaged: "true",
labelProxyGroup: "dev",
labelSvcType: typeEgress,
},
},
Spec: corev1.ServiceSpec{},
}
if port != -1 {
svc.Spec.Ports = []corev1.ServicePort{
{
Name: tsHealthCheckPortName,
Port: port,
TargetPort: intstr.FromInt(9002),
Protocol: "TCP",
},
}
}
return svc, fmt.Sprintf("http://%s.operator-ns.svc.cluster.local:%d/healthz", name, port)
}
func podSetReady(pod *corev1.Pod, cl *tstest.Clock) {
pod.Status.Conditions = append(pod.Status.Conditions, corev1.PodCondition{
Type: tsEgressReadinessGate,
Status: corev1.ConditionTrue,
LastTransitionTime: metav1.Time{Time: cl.Now().Truncate(time.Second)},
})
}
// fakeHTTPClient is a mock HTTP client with a preset map of request URLs to list of responses. When it receives a
// request for a specific URL, it returns the preset response for that URL. It errors if an unexpected request is
// received.
type fakeHTTPClient struct {
t *testing.T
mu sync.Mutex // protects following
state map[string][]fakeResponse
}
func (f *fakeHTTPClient) Do(req *http.Request) (*http.Response, error) {
f.mu.Lock()
resps := f.state[req.URL.String()]
if len(resps) == 0 {
f.mu.Unlock()
log.Printf("\n\n\nURL %q\n\n\n", req.URL)
f.t.Fatalf("fakeHTTPClient received an unexpected request for %q", req.URL)
}
defer func() {
if len(resps) == 1 {
delete(f.state, req.URL.String())
f.mu.Unlock()
return
}
f.state[req.URL.String()] = f.state[req.URL.String()][1:]
f.mu.Unlock()
}()
resp := resps[0]
if resp.err != nil {
return nil, resp.err
}
r := http.Response{
StatusCode: resp.statusCode,
Header: make(http.Header),
Body: io.NopCloser(bytes.NewReader([]byte{})),
}
r.Header.Add(kubetypes.PodIPv4Header, resp.podIP)
return &r, nil
}
type fakeResponse struct {
err error
statusCode int
podIP string // for the Pod IP header
}

View File

@@ -48,11 +48,12 @@ type egressSvcsReadinessReconciler struct {
// service to determine how many replicas are currently able to route traffic.
func (esrr *egressSvcsReadinessReconciler) Reconcile(ctx context.Context, req reconcile.Request) (res reconcile.Result, err error) {
l := esrr.logger.With("Service", req.NamespacedName)
defer l.Info("reconcile finished")
l.Debugf("starting reconcile")
defer l.Debugf("reconcile finished")
svc := new(corev1.Service)
if err = esrr.Get(ctx, req.NamespacedName, svc); apierrors.IsNotFound(err) {
l.Info("Service not found")
l.Debugf("Service not found")
return res, nil
} else if err != nil {
return res, fmt.Errorf("failed to get Service: %w", err)
@@ -127,16 +128,16 @@ func (esrr *egressSvcsReadinessReconciler) Reconcile(ctx context.Context, req re
return res, err
}
if pod == nil {
l.Infof("[unexpected] ProxyGroup is ready, but replica %d was not found", i)
l.Warnf("[unexpected] ProxyGroup is ready, but replica %d was not found", i)
reason, msg = reasonClusterResourcesNotReady, reasonClusterResourcesNotReady
return res, nil
}
l.Infof("looking at Pod with IPs %v", pod.Status.PodIPs)
l.Debugf("looking at Pod with IPs %v", pod.Status.PodIPs)
ready := false
for _, ep := range eps.Endpoints {
l.Infof("looking at endpoint with addresses %v", ep.Addresses)
l.Debugf("looking at endpoint with addresses %v", ep.Addresses)
if endpointReadyForPod(&ep, pod, l) {
l.Infof("endpoint is ready for Pod")
l.Debugf("endpoint is ready for Pod")
ready = true
break
}
@@ -165,7 +166,7 @@ func (esrr *egressSvcsReadinessReconciler) Reconcile(ctx context.Context, req re
func endpointReadyForPod(ep *discoveryv1.Endpoint, pod *corev1.Pod, l *zap.SugaredLogger) bool {
podIP, err := podIPv4(pod)
if err != nil {
l.Infof("[unexpected] error retrieving Pod's IPv4 address: %v", err)
l.Warnf("[unexpected] error retrieving Pod's IPv4 address: %v", err)
return false
}
// Currently we only ever set a single address on and Endpoint and nothing else is meant to modify this.

View File

@@ -59,6 +59,8 @@ const (
maxPorts = 1000
indexEgressProxyGroup = ".metadata.annotations.egress-proxy-group"
tsHealthCheckPortName = "tailscale-health-check"
)
var gaugeEgressServices = clientmetric.NewGauge(kubetypes.MetricEgressServiceCount)
@@ -229,15 +231,16 @@ func (esr *egressSvcsReconciler) provision(ctx context.Context, proxyGroupName s
found := false
for _, wantsPM := range svc.Spec.Ports {
if wantsPM.Port == pm.Port && strings.EqualFold(string(wantsPM.Protocol), string(pm.Protocol)) {
// We don't use the port name to distinguish this port internally, but Kubernetes
// require that, for Service ports with more than one name each port is uniquely named.
// So we can always pick the port name from the ExternalName Service as at this point we
// know that those are valid names because Kuberentes already validated it once. Note
// that users could have changed an unnamed port to a named port and might have changed
// port names- this should still work.
// We want to both preserve the user set port names for ease of debugging, but also
// ensure that we name all unnamed ports as the ClusterIP Service that we create will
// always have at least two ports.
// https://kubernetes.io/docs/concepts/services-networking/service/#multi-port-services
// See also https://github.com/tailscale/tailscale/issues/13406#issuecomment-2507230388
clusterIPSvc.Spec.Ports[i].Name = wantsPM.Name
if wantsPM.Name != "" {
clusterIPSvc.Spec.Ports[i].Name = wantsPM.Name
} else {
clusterIPSvc.Spec.Ports[i].Name = "tailscale-unnamed"
}
found = true
break
}
@@ -252,6 +255,12 @@ func (esr *egressSvcsReconciler) provision(ctx context.Context, proxyGroupName s
// ClusterIP Service produce new target port and add a portmapping to
// the ClusterIP Service.
for _, wantsPM := range svc.Spec.Ports {
// Because we add a healthcheck port of our own, we will always have at least two ports. That
// means that we cannot have ports with name not set.
// https://kubernetes.io/docs/concepts/services-networking/service/#multi-port-services
if wantsPM.Name == "" {
wantsPM.Name = "tailscale-unnamed"
}
found := false
for _, gotPM := range clusterIPSvc.Spec.Ports {
if wantsPM.Port == gotPM.Port && strings.EqualFold(string(wantsPM.Protocol), string(gotPM.Protocol)) {
@@ -278,6 +287,25 @@ func (esr *egressSvcsReconciler) provision(ctx context.Context, proxyGroupName s
})
}
}
var healthCheckPort int32 = defaultLocalAddrPort
for {
if !slices.ContainsFunc(svc.Spec.Ports, func(p corev1.ServicePort) bool {
return p.Port == healthCheckPort
}) {
break
}
healthCheckPort++
if healthCheckPort > 10002 {
return nil, false, fmt.Errorf("unable to find a free port for internal health check in range [9002, 10002]")
}
}
clusterIPSvc.Spec.Ports = append(clusterIPSvc.Spec.Ports, corev1.ServicePort{
Name: tsHealthCheckPortName,
Port: healthCheckPort,
TargetPort: intstr.FromInt(defaultLocalAddrPort),
Protocol: "TCP",
})
if !reflect.DeepEqual(clusterIPSvc, oldClusterIPSvc) {
if clusterIPSvc, err = createOrUpdate(ctx, esr.Client, esr.tsNamespace, clusterIPSvc, func(svc *corev1.Service) {
svc.Labels = clusterIPSvc.Labels
@@ -320,7 +348,7 @@ func (esr *egressSvcsReconciler) provision(ctx context.Context, proxyGroupName s
}
tailnetSvc := tailnetSvcName(svc)
gotCfg := (*cfgs)[tailnetSvc]
wantsCfg := egressSvcCfg(svc, clusterIPSvc)
wantsCfg := egressSvcCfg(svc, clusterIPSvc, esr.tsNamespace, l)
if !reflect.DeepEqual(gotCfg, wantsCfg) {
l.Debugf("updating egress services ConfigMap %s", cm.Name)
mak.Set(cfgs, tailnetSvc, wantsCfg)
@@ -504,10 +532,8 @@ func (esr *egressSvcsReconciler) validateClusterResources(ctx context.Context, s
return false, nil
}
if !tsoperator.ProxyGroupIsReady(pg) {
l.Infof("ProxyGroup %s is not ready, waiting...", proxyGroupName)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcValid, metav1.ConditionUnknown, reasonProxyGroupNotReady, reasonProxyGroupNotReady, esr.clock, l)
tsoperator.RemoveServiceCondition(svc, tsapi.EgressSvcConfigured)
return false, nil
}
l.Debugf("egress service is valid")
@@ -515,6 +541,24 @@ func (esr *egressSvcsReconciler) validateClusterResources(ctx context.Context, s
return true, nil
}
func egressSvcCfg(externalNameSvc, clusterIPSvc *corev1.Service, ns string, l *zap.SugaredLogger) egressservices.Config {
d := retrieveClusterDomain(ns, l)
tt := tailnetTargetFromSvc(externalNameSvc)
hep := healthCheckForSvc(clusterIPSvc, d)
cfg := egressservices.Config{
TailnetTarget: tt,
HealthCheckEndpoint: hep,
}
for _, svcPort := range clusterIPSvc.Spec.Ports {
if svcPort.Name == tsHealthCheckPortName {
continue // exclude healthcheck from egress svcs configs
}
pm := portMap(svcPort)
mak.Set(&cfg.Ports, pm, struct{}{})
}
return cfg
}
func validateEgressService(svc *corev1.Service, pg *tsapi.ProxyGroup) []string {
violations := validateService(svc)
@@ -584,16 +628,6 @@ func tailnetTargetFromSvc(svc *corev1.Service) egressservices.TailnetTarget {
}
}
func egressSvcCfg(externalNameSvc, clusterIPSvc *corev1.Service) egressservices.Config {
tt := tailnetTargetFromSvc(externalNameSvc)
cfg := egressservices.Config{TailnetTarget: tt}
for _, svcPort := range clusterIPSvc.Spec.Ports {
pm := portMap(svcPort)
mak.Set(&cfg.Ports, pm, struct{}{})
}
return cfg
}
func portMap(p corev1.ServicePort) egressservices.PortMap {
// TODO (irbekrm): out of bounds check?
return egressservices.PortMap{Protocol: string(p.Protocol), MatchPort: uint16(p.TargetPort.IntVal), TargetPort: uint16(p.Port)}
@@ -618,7 +652,11 @@ func egressSvcsConfigs(ctx context.Context, cl client.Client, proxyGroupName, ts
Namespace: tsNamespace,
},
}
if err := cl.Get(ctx, client.ObjectKeyFromObject(cm), cm); err != nil {
err = cl.Get(ctx, client.ObjectKeyFromObject(cm), cm)
if apierrors.IsNotFound(err) { // ProxyGroup resources have not been created (yet)
return nil, nil, nil
}
if err != nil {
return nil, nil, fmt.Errorf("error retrieving egress services ConfigMap %s: %v", name, err)
}
cfgs = &egressservices.Configs{}
@@ -740,3 +778,17 @@ func (esr *egressSvcsReconciler) updateSvcSpec(ctx context.Context, svc *corev1.
svc.Status = *st
return err
}
// healthCheckForSvc return the URL of the containerboot's health check endpoint served by this Service or empty string.
func healthCheckForSvc(svc *corev1.Service, clusterDomain string) string {
// This version of the operator always sets health check port on the egress Services. However, it is possible
// that this reconcile loops runs during a proxy upgrade from a version that did not set the health check port
// and parses a Service that does not have the port set yet.
i := slices.IndexFunc(svc.Spec.Ports, func(port corev1.ServicePort) bool {
return port.Name == tsHealthCheckPortName
})
if i == -1 {
return ""
}
return fmt.Sprintf("http://%s.%s.svc.%s:%d/healthz", svc.Name, svc.Namespace, clusterDomain, svc.Spec.Ports[i].Port)
}

View File

@@ -18,6 +18,7 @@ import (
discoveryv1 "k8s.io/api/discovery/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/intstr"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
@@ -78,42 +79,16 @@ func TestTailscaleEgressServices(t *testing.T) {
Selector: nil,
Ports: []corev1.ServicePort{
{
Name: "http",
Protocol: "TCP",
Port: 80,
},
{
Name: "https",
Protocol: "TCP",
Port: 443,
},
},
},
}
t.Run("proxy_group_not_ready", func(t *testing.T) {
t.Run("service_one_unnamed_port", func(t *testing.T) {
mustCreate(t, fc, svc)
expectReconciled(t, esr, "default", "test")
// Service should have EgressSvcValid condition set to Unknown.
svc.Status.Conditions = []metav1.Condition{condition(tsapi.EgressSvcValid, metav1.ConditionUnknown, reasonProxyGroupNotReady, reasonProxyGroupNotReady, clock)}
expectEqual(t, fc, svc)
})
t.Run("proxy_group_ready", func(t *testing.T) {
mustUpdateStatus(t, fc, "", "foo", func(pg *tsapi.ProxyGroup) {
pg.Status.Conditions = []metav1.Condition{
condition(tsapi.ProxyGroupReady, metav1.ConditionTrue, "", "", clock),
}
})
expectReconciled(t, esr, "default", "test")
validateReadyService(t, fc, esr, svc, clock, zl, cm)
})
t.Run("service_retain_one_unnamed_port", func(t *testing.T) {
svc.Spec.Ports = []corev1.ServicePort{{Protocol: "TCP", Port: 80}}
mustUpdate(t, fc, "default", "test", func(s *corev1.Service) {
s.Spec.Ports = svc.Spec.Ports
})
expectReconciled(t, esr, "default", "test")
validateReadyService(t, fc, esr, svc, clock, zl, cm)
})
t.Run("service_add_two_named_ports", func(t *testing.T) {
@@ -164,7 +139,7 @@ func validateReadyService(t *testing.T, fc client.WithWatch, esr *egressSvcsReco
// Verify that an EndpointSlice has been created.
expectEqual(t, fc, endpointSlice(name, svc, clusterSvc))
// Verify that ConfigMap contains configuration for the new egress service.
mustHaveConfigForSvc(t, fc, svc, clusterSvc, cm)
mustHaveConfigForSvc(t, fc, svc, clusterSvc, cm, zl)
r := svcConfiguredReason(svc, true, zl.Sugar())
// Verify that the user-created ExternalName Service has Configured set to true and ExternalName pointing to the
// CluterIP Service.
@@ -203,6 +178,23 @@ func findGenNameForEgressSvcResources(t *testing.T, client client.Client, svc *c
func clusterIPSvc(name string, extNSvc *corev1.Service) *corev1.Service {
labels := egressSvcChildResourceLabels(extNSvc)
ports := make([]corev1.ServicePort, len(extNSvc.Spec.Ports))
for i, port := range extNSvc.Spec.Ports {
ports[i] = corev1.ServicePort{ // Copy the port to avoid modifying the original.
Name: port.Name,
Port: port.Port,
Protocol: port.Protocol,
}
if port.Name == "" {
ports[i].Name = "tailscale-unnamed"
}
}
ports = append(ports, corev1.ServicePort{
Name: "tailscale-health-check",
Port: 9002,
TargetPort: intstr.FromInt(9002),
Protocol: "TCP",
})
return &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: name,
@@ -212,7 +204,7 @@ func clusterIPSvc(name string, extNSvc *corev1.Service) *corev1.Service {
},
Spec: corev1.ServiceSpec{
Type: corev1.ServiceTypeClusterIP,
Ports: extNSvc.Spec.Ports,
Ports: ports,
},
}
}
@@ -257,9 +249,9 @@ func portsForEndpointSlice(svc *corev1.Service) []discoveryv1.EndpointPort {
return ports
}
func mustHaveConfigForSvc(t *testing.T, cl client.Client, extNSvc, clusterIPSvc *corev1.Service, cm *corev1.ConfigMap) {
func mustHaveConfigForSvc(t *testing.T, cl client.Client, extNSvc, clusterIPSvc *corev1.Service, cm *corev1.ConfigMap, l *zap.Logger) {
t.Helper()
wantsCfg := egressSvcCfg(extNSvc, clusterIPSvc)
wantsCfg := egressSvcCfg(extNSvc, clusterIPSvc, clusterIPSvc.Namespace, l.Sugar())
if err := cl.Get(context.Background(), client.ObjectKeyFromObject(cm), cm); err != nil {
t.Fatalf("Error retrieving ConfigMap: %v", err)
}

View File

@@ -9,6 +9,7 @@ package main
import (
"context"
"net/http"
"os"
"regexp"
"strconv"
@@ -330,28 +331,6 @@ func runReconcilers(opts reconcilerOpts) {
if err != nil {
startlog.Fatalf("could not create ingress reconciler: %v", err)
}
lc, err := opts.tsServer.LocalClient()
if err != nil {
startlog.Fatalf("could not get local client: %v", err)
}
err = builder.
ControllerManagedBy(mgr).
For(&networkingv1.Ingress{}).
Named("ingress-pg-reconciler").
Watches(&corev1.Service{}, handler.EnqueueRequestsFromMapFunc(serviceHandlerForIngressPG(mgr.GetClient(), startlog))).
Complete(&IngressPGReconciler{
recorder: eventRecorder,
tsClient: opts.tsClient,
tsnetServer: opts.tsServer,
defaultTags: strings.Split(opts.proxyTags, ","),
Client: mgr.GetClient(),
logger: opts.log.Named("ingress-pg-reconciler"),
lc: lc,
tsNamespace: opts.tailscaleNamespace,
})
if err != nil {
startlog.Fatalf("could not create ingress-pg-reconciler: %v", err)
}
connectorFilter := handler.EnqueueRequestsFromMapFunc(managedResourceHandlerForType("connector"))
// If a ProxyClassChanges, enqueue all Connectors that have
@@ -453,6 +432,24 @@ func runReconcilers(opts reconcilerOpts) {
startlog.Fatalf("could not create egress EndpointSlices reconciler: %v", err)
}
podsForEps := handler.EnqueueRequestsFromMapFunc(podsFromEgressEps(mgr.GetClient(), opts.log, opts.tailscaleNamespace))
podsER := handler.EnqueueRequestsFromMapFunc(egressPodsHandler)
err = builder.
ControllerManagedBy(mgr).
Named("egress-pods-readiness-reconciler").
Watches(&discoveryv1.EndpointSlice{}, podsForEps).
Watches(&corev1.Pod{}, podsER).
Complete(&egressPodsReconciler{
Client: mgr.GetClient(),
tsNamespace: opts.tailscaleNamespace,
clock: tstime.DefaultClock{},
logger: opts.log.Named("egress-pods-readiness-reconciler"),
httpClient: http.DefaultClient,
})
if err != nil {
startlog.Fatalf("could not create egress Pods readiness reconciler: %v", err)
}
// ProxyClass reconciler gets triggered on ServiceMonitor CRD changes to ensure that any ProxyClasses, that
// define that a ServiceMonitor should be created, were set to invalid because the CRD did not exist get
// reconciled if the CRD is applied at a later point.
@@ -777,7 +774,7 @@ func proxyClassHandlerForConnector(cl client.Client, logger *zap.SugaredLogger)
}
}
// proxyClassHandlerForConnector returns a handler that, for a given ProxyClass,
// proxyClassHandlerForProxyGroup returns a handler that, for a given ProxyClass,
// returns a list of reconcile requests for all Connectors that have
// .spec.proxyClass set.
func proxyClassHandlerForProxyGroup(cl client.Client, logger *zap.SugaredLogger) handler.MapFunc {
@@ -906,6 +903,20 @@ func egressEpsHandler(_ context.Context, o client.Object) []reconcile.Request {
}
}
func egressPodsHandler(_ context.Context, o client.Object) []reconcile.Request {
if typ := o.GetLabels()[LabelParentType]; typ != proxyTypeProxyGroup {
return nil
}
return []reconcile.Request{
{
NamespacedName: types.NamespacedName{
Namespace: o.GetNamespace(),
Name: o.GetName(),
},
},
}
}
// egressEpsFromEgressPods returns a Pod event handler that checks if Pod is a replica for a ProxyGroup and if it is,
// returns reconciler requests for all egress EndpointSlices for that ProxyGroup.
func egressEpsFromPGPods(cl client.Client, ns string) handler.MapFunc {
@@ -998,7 +1009,7 @@ func reconcileRequestsForPG(pg string, cl client.Client, ns string) []reconcile.
// egressSvcsFromEgressProxyGroup is an event handler for egress ProxyGroups. It returns reconcile requests for all
// user-created ExternalName Services that should be exposed on this ProxyGroup.
func egressSvcsFromEgressProxyGroup(cl client.Client, logger *zap.SugaredLogger) handler.MapFunc {
return func(_ context.Context, o client.Object) []reconcile.Request {
return func(ctx context.Context, o client.Object) []reconcile.Request {
pg, ok := o.(*tsapi.ProxyGroup)
if !ok {
logger.Infof("[unexpected] ProxyGroup handler triggered for an object that is not a ProxyGroup")
@@ -1008,7 +1019,7 @@ func egressSvcsFromEgressProxyGroup(cl client.Client, logger *zap.SugaredLogger)
return nil
}
svcList := &corev1.ServiceList{}
if err := cl.List(context.Background(), svcList, client.MatchingFields{indexEgressProxyGroup: pg.Name}); err != nil {
if err := cl.List(ctx, svcList, client.MatchingFields{indexEgressProxyGroup: pg.Name}); err != nil {
logger.Infof("error listing Services: %v, skipping a reconcile for event on ProxyGroup %s", err, pg.Name)
return nil
}
@@ -1028,7 +1039,7 @@ func egressSvcsFromEgressProxyGroup(cl client.Client, logger *zap.SugaredLogger)
// epsFromExternalNameService is an event handler for ExternalName Services that define a Tailscale egress service that
// should be exposed on a ProxyGroup. It returns reconcile requests for EndpointSlices created for this Service.
func epsFromExternalNameService(cl client.Client, logger *zap.SugaredLogger, ns string) handler.MapFunc {
return func(_ context.Context, o client.Object) []reconcile.Request {
return func(ctx context.Context, o client.Object) []reconcile.Request {
svc, ok := o.(*corev1.Service)
if !ok {
logger.Infof("[unexpected] Service handler triggered for an object that is not a Service")
@@ -1038,7 +1049,7 @@ func epsFromExternalNameService(cl client.Client, logger *zap.SugaredLogger, ns
return nil
}
epsList := &discoveryv1.EndpointSliceList{}
if err := cl.List(context.Background(), epsList, client.InNamespace(ns),
if err := cl.List(ctx, epsList, client.InNamespace(ns),
client.MatchingLabels(egressSvcChildResourceLabels(svc))); err != nil {
logger.Infof("error listing EndpointSlices: %v, skipping a reconcile for event on Service %s", err, svc.Name)
return nil
@@ -1056,6 +1067,43 @@ func epsFromExternalNameService(cl client.Client, logger *zap.SugaredLogger, ns
}
}
func podsFromEgressEps(cl client.Client, logger *zap.SugaredLogger, ns string) handler.MapFunc {
return func(ctx context.Context, o client.Object) []reconcile.Request {
eps, ok := o.(*discoveryv1.EndpointSlice)
if !ok {
logger.Infof("[unexpected] EndpointSlice handler triggered for an object that is not a EndpointSlice")
return nil
}
if eps.Labels[labelProxyGroup] == "" {
return nil
}
if eps.Labels[labelSvcType] != "egress" {
return nil
}
podLabels := map[string]string{
LabelManaged: "true",
LabelParentType: "proxygroup",
LabelParentName: eps.Labels[labelProxyGroup],
}
podList := &corev1.PodList{}
if err := cl.List(ctx, podList, client.InNamespace(ns),
client.MatchingLabels(podLabels)); err != nil {
logger.Infof("error listing EndpointSlices: %v, skipping a reconcile for event on EndpointSlice %s", err, eps.Name)
return nil
}
reqs := make([]reconcile.Request, 0)
for _, pod := range podList.Items {
reqs = append(reqs, reconcile.Request{
NamespacedName: types.NamespacedName{
Namespace: pod.Namespace,
Name: pod.Name,
},
})
}
return reqs
}
}
// proxyClassesWithServiceMonitor returns an event handler that, given that the event is for the Prometheus
// ServiceMonitor CRD, returns all ProxyClasses that define that a ServiceMonitor should be created.
func proxyClassesWithServiceMonitor(cl client.Client, logger *zap.SugaredLogger) handler.MapFunc {
@@ -1108,42 +1156,6 @@ func indexEgressServices(o client.Object) []string {
return []string{o.GetAnnotations()[AnnotationProxyGroup]}
}
// serviceHandlerForIngressPG returns a handler for Service events that ensures that if the Service
// associated with an event is a backend Service for a tailscale Ingress with ProxyGroup annotation,
// the associated Ingress gets reconciled.
func serviceHandlerForIngressPG(cl client.Client, logger *zap.SugaredLogger) handler.MapFunc {
return func(ctx context.Context, o client.Object) []reconcile.Request {
ingList := networkingv1.IngressList{}
if err := cl.List(ctx, &ingList, client.InNamespace(o.GetNamespace())); err != nil {
logger.Debugf("error listing Ingresses: %v", err)
return nil
}
reqs := make([]reconcile.Request, 0)
for _, ing := range ingList.Items {
if ing.Spec.IngressClassName == nil || *ing.Spec.IngressClassName != tailscaleIngressClassName {
continue
}
if !hasProxyGroupAnnotation(&ing) {
continue
}
if ing.Spec.DefaultBackend != nil && ing.Spec.DefaultBackend.Service != nil && ing.Spec.DefaultBackend.Service.Name == o.GetName() {
reqs = append(reqs, reconcile.Request{NamespacedName: client.ObjectKeyFromObject(&ing)})
}
for _, rule := range ing.Spec.Rules {
if rule.HTTP == nil {
continue
}
for _, path := range rule.HTTP.Paths {
if path.Backend.Service != nil && path.Backend.Service.Name == o.GetName() {
reqs = append(reqs, reconcile.Request{NamespacedName: client.ObjectKeyFromObject(&ing)})
}
}
}
}
return reqs
}
}
func hasProxyGroupAnnotation(obj client.Object) bool {
ing := obj.(*networkingv1.Ingress)
return ing.Annotations[AnnotationProxyGroup] != ""

View File

@@ -1339,71 +1339,6 @@ func TestProxyFirewallMode(t *testing.T) {
expectEqual(t, fc, expectedSTS(t, fc, o), removeHashAnnotation, removeResourceReqs)
}
func TestTailscaledConfigfileHash(t *testing.T) {
fc := fake.NewFakeClient()
ft := &fakeTSClient{}
zl, err := zap.NewDevelopment()
if err != nil {
t.Fatal(err)
}
clock := tstest.NewClock(tstest.ClockOpts{})
sr := &ServiceReconciler{
Client: fc,
ssr: &tailscaleSTSReconciler{
Client: fc,
tsClient: ft,
defaultTags: []string{"tag:k8s"},
operatorNamespace: "operator-ns",
proxyImage: "tailscale/tailscale",
},
logger: zl.Sugar(),
clock: clock,
isDefaultLoadBalancer: true,
}
// Create a service that we should manage, and check that the initial round
// of objects looks right.
mustCreate(t, fc, &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
Namespace: "default",
// The apiserver is supposed to set the UID, but the fake client
// doesn't. So, set it explicitly because other code later depends
// on it being set.
UID: types.UID("1234-UID"),
},
Spec: corev1.ServiceSpec{
ClusterIP: "10.20.30.40",
Type: corev1.ServiceTypeLoadBalancer,
},
})
expectReconciled(t, sr, "default", "test")
expectReconciled(t, sr, "default", "test")
fullName, shortName := findGenName(t, fc, "default", "test", "svc")
o := configOpts{
stsName: shortName,
secretName: fullName,
namespace: "default",
parentType: "svc",
hostname: "default-test",
clusterTargetIP: "10.20.30.40",
confFileHash: "848bff4b5ba83ac999e6984c8464e597156daba961ae045e7dbaef606d54ab5e",
app: kubetypes.AppIngressProxy,
}
expectEqual(t, fc, expectedSTS(t, fc, o), removeResourceReqs)
// 2. Hostname gets changed, configfile is updated and a new hash value
// is produced.
mustUpdate(t, fc, "default", "test", func(svc *corev1.Service) {
mak.Set(&svc.Annotations, AnnotationHostname, "another-test")
})
o.hostname = "another-test"
o.confFileHash = "d4cc13f09f55f4f6775689004f9a466723325b84d2b590692796bfe22aeaa389"
expectReconciled(t, sr, "default", "test")
expectEqual(t, fc, expectedSTS(t, fc, o), removeResourceReqs)
}
func Test_isMagicDNSName(t *testing.T) {
tests := []struct {
in string

View File

@@ -32,6 +32,7 @@ import (
"tailscale.com/ipn"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/egressservices"
"tailscale.com/kube/kubetypes"
"tailscale.com/tailcfg"
"tailscale.com/tstime"
@@ -166,6 +167,7 @@ func (r *ProxyGroupReconciler) Reconcile(ctx context.Context, req reconcile.Requ
r.recorder.Eventf(pg, corev1.EventTypeWarning, reasonProxyGroupCreationFailed, err.Error())
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreationFailed, err.Error())
}
validateProxyClassForPG(logger, pg, proxyClass)
if !tsoperator.ProxyClassIsReady(proxyClass) {
message := fmt.Sprintf("the ProxyGroup's ProxyClass %s is not yet in a ready state, waiting...", proxyClassName)
logger.Info(message)
@@ -204,6 +206,31 @@ func (r *ProxyGroupReconciler) Reconcile(ctx context.Context, req reconcile.Requ
return setStatusReady(pg, metav1.ConditionTrue, reasonProxyGroupReady, reasonProxyGroupReady)
}
// validateProxyClassForPG applies custom validation logic for ProxyClass applied to ProxyGroup.
func validateProxyClassForPG(logger *zap.SugaredLogger, pg *tsapi.ProxyGroup, pc *tsapi.ProxyClass) {
if pg.Spec.Type == tsapi.ProxyGroupTypeIngress {
return
}
// Our custom logic for ensuring minimum downtime ProxyGroup update rollouts relies on the local health check
// beig accessible on the replica Pod IP:9002. This address can also be modified by users, via
// TS_LOCAL_ADDR_PORT env var.
//
// Currently TS_LOCAL_ADDR_PORT controls Pod's health check and metrics address. _Probably_ there is no need for
// users to set this to a custom value. Users who want to consume metrics, should integrate with the metrics
// Service and/or ServiceMonitor, rather than Pods directly. The health check is likely not useful to integrate
// directly with for operator proxies (and we should aim for unified lifecycle logic in the operator, users
// shouldn't need to set their own).
//
// TODO(irbekrm): maybe disallow configuring this env var in future (in Tailscale 1.84 or later).
if hasLocalAddrPortSet(pc) {
msg := fmt.Sprintf("ProxyClass %s applied to an egress ProxyGroup has TS_LOCAL_ADDR_PORT env var set to a custom value."+
"This will disable the ProxyGroup graceful failover mechanism, so you might experience downtime when ProxyGroup pods are restarted."+
"In future we will remove the ability to set custom TS_LOCAL_ADDR_PORT for egress ProxyGroups."+
"Please raise an issue if you expect that this will cause issues for your workflow.", pc.Name)
logger.Warn(msg)
}
}
func (r *ProxyGroupReconciler) maybeProvision(ctx context.Context, pg *tsapi.ProxyGroup, proxyClass *tsapi.ProxyClass) error {
logger := r.logger(pg.Name)
r.mu.Lock()
@@ -253,10 +280,11 @@ func (r *ProxyGroupReconciler) maybeProvision(ctx context.Context, pg *tsapi.Pro
return fmt.Errorf("error provisioning RoleBinding: %w", err)
}
if pg.Spec.Type == tsapi.ProxyGroupTypeEgress {
cm := pgEgressCM(pg, r.tsNamespace)
cm, hp := pgEgressCM(pg, r.tsNamespace)
if _, err := createOrUpdate(ctx, r.Client, r.tsNamespace, cm, func(existing *corev1.ConfigMap) {
existing.ObjectMeta.Labels = cm.ObjectMeta.Labels
existing.ObjectMeta.OwnerReferences = cm.ObjectMeta.OwnerReferences
mak.Set(&existing.BinaryData, egressservices.KeyHEPPings, hp)
}); err != nil {
return fmt.Errorf("error provisioning egress ConfigMap %q: %w", cm.Name, err)
}
@@ -270,7 +298,7 @@ func (r *ProxyGroupReconciler) maybeProvision(ctx context.Context, pg *tsapi.Pro
return fmt.Errorf("error provisioning ingress ConfigMap %q: %w", cm.Name, err)
}
}
ss, err := pgStatefulSet(pg, r.tsNamespace, r.proxyImage, r.tsFirewallMode)
ss, err := pgStatefulSet(pg, r.tsNamespace, r.proxyImage, r.tsFirewallMode, proxyClass)
if err != nil {
return fmt.Errorf("error generating StatefulSet spec: %w", err)
}

View File

@@ -7,11 +7,14 @@ package main
import (
"fmt"
"slices"
"strconv"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
rbacv1 "k8s.io/api/rbac/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/intstr"
"sigs.k8s.io/yaml"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/egressservices"
@@ -19,9 +22,12 @@ import (
"tailscale.com/types/ptr"
)
// deletionGracePeriodSeconds is set to 6 minutes to ensure that the pre-stop hook of these proxies have enough chance to terminate gracefully.
const deletionGracePeriodSeconds int64 = 360
// Returns the base StatefulSet definition for a ProxyGroup. A ProxyClass may be
// applied over the top after.
func pgStatefulSet(pg *tsapi.ProxyGroup, namespace, image, tsFirewallMode string) (*appsv1.StatefulSet, error) {
func pgStatefulSet(pg *tsapi.ProxyGroup, namespace, image, tsFirewallMode string, proxyClass *tsapi.ProxyClass) (*appsv1.StatefulSet, error) {
ss := new(appsv1.StatefulSet)
if err := yaml.Unmarshal(proxyYaml, &ss); err != nil {
return nil, fmt.Errorf("failed to unmarshal proxy spec: %w", err)
@@ -145,15 +151,25 @@ func pgStatefulSet(pg *tsapi.ProxyGroup, namespace, image, tsFirewallMode string
}
if pg.Spec.Type == tsapi.ProxyGroupTypeEgress {
envs = append(envs, corev1.EnvVar{
Name: "TS_EGRESS_SERVICES_CONFIG_PATH",
Value: fmt.Sprintf("/etc/proxies/%s", egressservices.KeyEgressServices),
},
envs = append(envs,
// TODO(irbekrm): in 1.80 we deprecated TS_EGRESS_SERVICES_CONFIG_PATH in favour of
// TS_EGRESS_PROXIES_CONFIG_PATH. Remove it in 1.84.
corev1.EnvVar{
Name: "TS_EGRESS_SERVICES_CONFIG_PATH",
Value: fmt.Sprintf("/etc/proxies/%s", egressservices.KeyEgressServices),
},
corev1.EnvVar{
Name: "TS_EGRESS_PROXIES_CONFIG_PATH",
Value: "/etc/proxies",
},
corev1.EnvVar{
Name: "TS_INTERNAL_APP",
Value: kubetypes.AppProxyGroupEgress,
},
)
corev1.EnvVar{
Name: "TS_ENABLE_HEALTH_CHECK",
Value: "true",
})
} else { // ingress
envs = append(envs, corev1.EnvVar{
Name: "TS_INTERNAL_APP",
@@ -167,6 +183,25 @@ func pgStatefulSet(pg *tsapi.ProxyGroup, namespace, image, tsFirewallMode string
return append(c.Env, envs...)
}()
// The pre-stop hook is used to ensure that a replica does not get terminated while cluster traffic for egress
// services is still being routed to it.
//
// This mechanism currently (2025-01-26) rely on the local health check being accessible on the Pod's
// IP, so they are not supported for ProxyGroups where users have configured TS_LOCAL_ADDR_PORT to a custom
// value.
if pg.Spec.Type == tsapi.ProxyGroupTypeEgress && !hasLocalAddrPortSet(proxyClass) {
c.Lifecycle = &corev1.Lifecycle{
PreStop: &corev1.LifecycleHandler{
HTTPGet: &corev1.HTTPGetAction{
Path: kubetypes.EgessServicesPreshutdownEP,
Port: intstr.FromInt(defaultLocalAddrPort),
},
},
}
// Set the deletion grace period to 6 minutes to ensure that the pre-stop hook has enough time to terminate
// gracefully.
ss.Spec.Template.DeletionGracePeriodSeconds = ptr.To(deletionGracePeriodSeconds)
}
return ss, nil
}
@@ -258,7 +293,9 @@ func pgStateSecrets(pg *tsapi.ProxyGroup, namespace string) (secrets []*corev1.S
return secrets
}
func pgEgressCM(pg *tsapi.ProxyGroup, namespace string) *corev1.ConfigMap {
func pgEgressCM(pg *tsapi.ProxyGroup, namespace string) (*corev1.ConfigMap, []byte) {
hp := hepPings(pg)
hpBs := []byte(strconv.Itoa(hp))
return &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: pgEgressCMName(pg.Name),
@@ -266,8 +303,10 @@ func pgEgressCM(pg *tsapi.ProxyGroup, namespace string) *corev1.ConfigMap {
Labels: pgLabels(pg.Name, nil),
OwnerReferences: pgOwnerReference(pg),
},
}
BinaryData: map[string][]byte{egressservices.KeyHEPPings: hpBs},
}, hpBs
}
func pgIngressCM(pg *tsapi.ProxyGroup, namespace string) *corev1.ConfigMap {
return &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
@@ -313,3 +352,23 @@ func pgReplicas(pg *tsapi.ProxyGroup) int32 {
func pgEgressCMName(pg string) string {
return fmt.Sprintf("%s-egress-config", pg)
}
// hasLocalAddrPortSet returns true if the proxyclass has the TS_LOCAL_ADDR_PORT env var set. For egress ProxyGroups,
// currently (2025-01-26) this means that the ProxyGroup does not support graceful failover.
func hasLocalAddrPortSet(proxyClass *tsapi.ProxyClass) bool {
if proxyClass == nil || proxyClass.Spec.StatefulSet == nil || proxyClass.Spec.StatefulSet.Pod == nil || proxyClass.Spec.StatefulSet.Pod.TailscaleContainer == nil {
return false
}
return slices.ContainsFunc(proxyClass.Spec.StatefulSet.Pod.TailscaleContainer.Env, func(env tsapi.Env) bool {
return env.Name == envVarTSLocalAddrPort
})
}
// hepPings returns the number of times a health check endpoint exposed by a Service fronting ProxyGroup replicas should
// be pinged to ensure that all currently configured backend replicas are hit.
func hepPings(pg *tsapi.ProxyGroup) int {
rc := pgReplicas(pg)
// Assuming a Service implemented using round robin load balancing, number-of-replica-times should be enough, but in
// practice, we cannot assume that the requests will be load balanced perfectly.
return int(rc) * 3
}

View File

@@ -19,13 +19,13 @@ import (
rbacv1 "k8s.io/api/rbac/v1"
apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/intstr"
"k8s.io/client-go/tools/record"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
"tailscale.com/client/tailscale"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/egressservices"
"tailscale.com/kube/kubetypes"
"tailscale.com/tstest"
"tailscale.com/types/ptr"
@@ -97,7 +97,7 @@ func TestProxyGroup(t *testing.T) {
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionFalse, reasonProxyGroupCreating, "the ProxyGroup's ProxyClass default-pc is not yet in a ready state, waiting...", 0, cl, zl.Sugar())
expectEqual(t, fc, pg)
expectProxyGroupResources(t, fc, pg, false, "")
expectProxyGroupResources(t, fc, pg, false, "", pc)
})
t.Run("observe_ProxyGroupCreating_status_reason", func(t *testing.T) {
@@ -118,11 +118,11 @@ func TestProxyGroup(t *testing.T) {
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionFalse, reasonProxyGroupCreating, "0/2 ProxyGroup pods running", 0, cl, zl.Sugar())
expectEqual(t, fc, pg)
expectProxyGroupResources(t, fc, pg, true, "")
expectProxyGroupResources(t, fc, pg, true, "", pc)
if expected := 1; reconciler.egressProxyGroups.Len() != expected {
t.Fatalf("expected %d egress ProxyGroups, got %d", expected, reconciler.egressProxyGroups.Len())
}
expectProxyGroupResources(t, fc, pg, true, "")
expectProxyGroupResources(t, fc, pg, true, "", pc)
keyReq := tailscale.KeyCapabilities{
Devices: tailscale.KeyDeviceCapabilities{
Create: tailscale.KeyDeviceCreateCapabilities{
@@ -154,7 +154,7 @@ func TestProxyGroup(t *testing.T) {
}
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionTrue, reasonProxyGroupReady, reasonProxyGroupReady, 0, cl, zl.Sugar())
expectEqual(t, fc, pg)
expectProxyGroupResources(t, fc, pg, true, initialCfgHash)
expectProxyGroupResources(t, fc, pg, true, initialCfgHash, pc)
})
t.Run("scale_up_to_3", func(t *testing.T) {
@@ -165,7 +165,7 @@ func TestProxyGroup(t *testing.T) {
expectReconciled(t, reconciler, "", pg.Name)
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionFalse, reasonProxyGroupCreating, "2/3 ProxyGroup pods running", 0, cl, zl.Sugar())
expectEqual(t, fc, pg)
expectProxyGroupResources(t, fc, pg, true, initialCfgHash)
expectProxyGroupResources(t, fc, pg, true, initialCfgHash, pc)
addNodeIDToStateSecrets(t, fc, pg)
expectReconciled(t, reconciler, "", pg.Name)
@@ -175,7 +175,7 @@ func TestProxyGroup(t *testing.T) {
TailnetIPs: []string{"1.2.3.4", "::1"},
})
expectEqual(t, fc, pg)
expectProxyGroupResources(t, fc, pg, true, initialCfgHash)
expectProxyGroupResources(t, fc, pg, true, initialCfgHash, pc)
})
t.Run("scale_down_to_1", func(t *testing.T) {
@@ -188,7 +188,7 @@ func TestProxyGroup(t *testing.T) {
pg.Status.Devices = pg.Status.Devices[:1] // truncate to only the first device.
expectEqual(t, fc, pg)
expectProxyGroupResources(t, fc, pg, true, initialCfgHash)
expectProxyGroupResources(t, fc, pg, true, initialCfgHash, pc)
})
t.Run("trigger_config_change_and_observe_new_config_hash", func(t *testing.T) {
@@ -202,7 +202,7 @@ func TestProxyGroup(t *testing.T) {
expectReconciled(t, reconciler, "", pg.Name)
expectEqual(t, fc, pg)
expectProxyGroupResources(t, fc, pg, true, "518a86e9fae64f270f8e0ec2a2ea6ca06c10f725035d3d6caca132cd61e42a74")
expectProxyGroupResources(t, fc, pg, true, "518a86e9fae64f270f8e0ec2a2ea6ca06c10f725035d3d6caca132cd61e42a74", pc)
})
t.Run("enable_metrics", func(t *testing.T) {
@@ -246,12 +246,29 @@ func TestProxyGroup(t *testing.T) {
// The fake client does not clean up objects whose owner has been
// deleted, so we can't test for the owned resources getting deleted.
})
}
func TestProxyGroupTypes(t *testing.T) {
pc := &tsapi.ProxyClass{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
Generation: 1,
},
Spec: tsapi.ProxyClassSpec{},
}
fc := fake.NewClientBuilder().
WithScheme(tsapi.GlobalScheme).
WithObjects(pc).
WithStatusSubresource(pc).
Build()
mustUpdateStatus(t, fc, "", pc.Name, func(p *tsapi.ProxyClass) {
p.Status.Conditions = []metav1.Condition{{
Type: string(tsapi.ProxyClassReady),
Status: metav1.ConditionTrue,
ObservedGeneration: 1,
}}
})
zl, _ := zap.NewDevelopment()
reconciler := &ProxyGroupReconciler{
@@ -274,9 +291,7 @@ func TestProxyGroupTypes(t *testing.T) {
Replicas: ptr.To[int32](0),
},
}
if err := fc.Create(context.Background(), pg); err != nil {
t.Fatal(err)
}
mustCreate(t, fc, pg)
expectReconciled(t, reconciler, "", pg.Name)
verifyProxyGroupCounts(t, reconciler, 0, 1)
@@ -286,7 +301,8 @@ func TestProxyGroupTypes(t *testing.T) {
t.Fatalf("failed to get StatefulSet: %v", err)
}
verifyEnvVar(t, sts, "TS_INTERNAL_APP", kubetypes.AppProxyGroupEgress)
verifyEnvVar(t, sts, "TS_EGRESS_SERVICES_CONFIG_PATH", fmt.Sprintf("/etc/proxies/%s", egressservices.KeyEgressServices))
verifyEnvVar(t, sts, "TS_EGRESS_PROXIES_CONFIG_PATH", "/etc/proxies")
verifyEnvVar(t, sts, "TS_ENABLE_HEALTH_CHECK", "true")
// Verify that egress configuration has been set up.
cm := &corev1.ConfigMap{}
@@ -323,6 +339,57 @@ func TestProxyGroupTypes(t *testing.T) {
if diff := cmp.Diff(expectedVolumeMounts, sts.Spec.Template.Spec.Containers[0].VolumeMounts); diff != "" {
t.Errorf("unexpected volume mounts (-want +got):\n%s", diff)
}
expectedLifecycle := corev1.Lifecycle{
PreStop: &corev1.LifecycleHandler{
HTTPGet: &corev1.HTTPGetAction{
Path: kubetypes.EgessServicesPreshutdownEP,
Port: intstr.FromInt(defaultLocalAddrPort),
},
},
}
if diff := cmp.Diff(expectedLifecycle, *sts.Spec.Template.Spec.Containers[0].Lifecycle); diff != "" {
t.Errorf("unexpected lifecycle (-want +got):\n%s", diff)
}
if *sts.Spec.Template.DeletionGracePeriodSeconds != deletionGracePeriodSeconds {
t.Errorf("unexpected deletion grace period seconds %d, want %d", *sts.Spec.Template.DeletionGracePeriodSeconds, deletionGracePeriodSeconds)
}
})
t.Run("egress_type_no_lifecycle_hook_when_local_addr_port_set", func(t *testing.T) {
pg := &tsapi.ProxyGroup{
ObjectMeta: metav1.ObjectMeta{
Name: "test-egress-no-lifecycle",
UID: "test-egress-no-lifecycle-uid",
},
Spec: tsapi.ProxyGroupSpec{
Type: tsapi.ProxyGroupTypeEgress,
Replicas: ptr.To[int32](0),
ProxyClass: "test",
},
}
mustCreate(t, fc, pg)
mustUpdate(t, fc, "", pc.Name, func(p *tsapi.ProxyClass) {
p.Spec.StatefulSet = &tsapi.StatefulSet{
Pod: &tsapi.Pod{
TailscaleContainer: &tsapi.Container{
Env: []tsapi.Env{{
Name: "TS_LOCAL_ADDR_PORT",
Value: "127.0.0.1:8080",
}},
},
},
}
})
expectReconciled(t, reconciler, "", pg.Name)
sts := &appsv1.StatefulSet{}
if err := fc.Get(context.Background(), client.ObjectKey{Namespace: tsNamespace, Name: pg.Name}, sts); err != nil {
t.Fatalf("failed to get StatefulSet: %v", err)
}
if sts.Spec.Template.Spec.Containers[0].Lifecycle != nil {
t.Error("lifecycle hook was set when TS_LOCAL_ADDR_PORT was configured via ProxyClass")
}
})
t.Run("ingress_type", func(t *testing.T) {
@@ -341,7 +408,7 @@ func TestProxyGroupTypes(t *testing.T) {
}
expectReconciled(t, reconciler, "", pg.Name)
verifyProxyGroupCounts(t, reconciler, 1, 1)
verifyProxyGroupCounts(t, reconciler, 1, 2)
sts := &appsv1.StatefulSet{}
if err := fc.Get(context.Background(), client.ObjectKey{Namespace: tsNamespace, Name: pg.Name}, sts); err != nil {
@@ -402,13 +469,13 @@ func verifyEnvVar(t *testing.T, sts *appsv1.StatefulSet, name, expectedValue str
t.Errorf("%s environment variable not found", name)
}
func expectProxyGroupResources(t *testing.T, fc client.WithWatch, pg *tsapi.ProxyGroup, shouldExist bool, cfgHash string) {
func expectProxyGroupResources(t *testing.T, fc client.WithWatch, pg *tsapi.ProxyGroup, shouldExist bool, cfgHash string, proxyClass *tsapi.ProxyClass) {
t.Helper()
role := pgRole(pg, tsNamespace)
roleBinding := pgRoleBinding(pg, tsNamespace)
serviceAccount := pgServiceAccount(pg, tsNamespace)
statefulSet, err := pgStatefulSet(pg, tsNamespace, testProxyImage, "auto")
statefulSet, err := pgStatefulSet(pg, tsNamespace, testProxyImage, "auto", proxyClass)
if err != nil {
t.Fatal(err)
}

View File

@@ -101,6 +101,9 @@ const (
proxyTypeIngressResource = "ingress_resource"
proxyTypeConnector = "connector"
proxyTypeProxyGroup = "proxygroup"
envVarTSLocalAddrPort = "TS_LOCAL_ADDR_PORT"
defaultLocalAddrPort = 9002 // metrics and health check port
)
var (
@@ -694,7 +697,7 @@ func (a *tailscaleSTSReconciler) reconcileSTS(ctx context.Context, logger *zap.S
// being created, there is no need for a restart.
// TODO(irbekrm): remove this in 1.84.
hash := tsConfigHash
if dev != nil && dev.capver >= 110 {
if dev == nil || dev.capver >= 110 {
hash = s.Spec.Template.GetAnnotations()[podAnnotationLastSetConfigFileHash]
}
s.Spec = ss.Spec

View File

@@ -583,6 +583,21 @@ func mustCreate(t *testing.T, client client.Client, obj client.Object) {
t.Fatalf("creating %q: %v", obj.GetName(), err)
}
}
func mustCreateAll(t *testing.T, client client.Client, objs ...client.Object) {
t.Helper()
for _, obj := range objs {
mustCreate(t, client, obj)
}
}
func mustDeleteAll(t *testing.T, client client.Client, objs ...client.Object) {
t.Helper()
for _, obj := range objs {
if err := client.Delete(context.Background(), obj); err != nil {
t.Fatalf("deleting %q: %v", obj.GetName(), err)
}
}
}
func mustUpdate[T any, O ptrObject[T]](t *testing.T, client client.Client, ns, name string, update func(O)) {
t.Helper()
@@ -706,6 +721,19 @@ func expectRequeue(t *testing.T, sr reconcile.Reconciler, ns, name string) {
t.Fatalf("expected timed requeue, got success")
}
}
func expectError(t *testing.T, sr reconcile.Reconciler, ns, name string) {
t.Helper()
req := reconcile.Request{
NamespacedName: types.NamespacedName{
Name: name,
Namespace: ns,
},
}
_, err := sr.Reconcile(context.Background(), req)
if err == nil {
t.Error("Reconcile: expected error but did not get one")
}
}
// expectEvents accepts a test recorder and a list of events, tests that expected
// events are sent down the recorder's channel. Waits for 5s for each event.

View File

@@ -10,6 +10,7 @@ import (
"context"
"encoding/binary"
"errors"
"expvar"
"flag"
"fmt"
"log"
@@ -26,6 +27,8 @@ import (
"github.com/inetaf/tcpproxy"
"github.com/peterbourgon/ff/v3"
"golang.org/x/net/dns/dnsmessage"
"gvisor.dev/gvisor/pkg/tcpip"
"gvisor.dev/gvisor/pkg/tcpip/transport/tcp"
"tailscale.com/client/tailscale"
"tailscale.com/envknob"
"tailscale.com/hostinfo"
@@ -37,6 +40,7 @@ import (
"tailscale.com/tsweb"
"tailscale.com/util/dnsname"
"tailscale.com/util/mak"
"tailscale.com/wgengine/netstack"
)
func main() {
@@ -112,6 +116,7 @@ func main() {
ts.Port = uint16(*wgPort)
}
defer ts.Close()
if *verboseTSNet {
ts.Logf = log.Printf
}
@@ -129,6 +134,36 @@ func main() {
log.Fatalf("debug serve: %v", http.Serve(dln, mux))
}()
}
if err := ts.Start(); err != nil {
log.Fatalf("ts.Start: %v", err)
}
// TODO(raggi): this is not a public interface or guarantee.
ns := ts.Sys().Netstack.Get().(*netstack.Impl)
tcpRXBufOpt := tcpip.TCPReceiveBufferSizeRangeOption{
Min: tcp.MinBufferSize,
Default: tcp.DefaultReceiveBufferSize,
Max: tcp.MaxBufferSize,
}
if err := ns.SetTransportProtocolOption(tcp.ProtocolNumber, &tcpRXBufOpt); err != nil {
log.Fatalf("could not set TCP RX buf size: %v", err)
}
tcpTXBufOpt := tcpip.TCPSendBufferSizeRangeOption{
Min: tcp.MinBufferSize,
Default: tcp.DefaultSendBufferSize,
Max: tcp.MaxBufferSize,
}
if err := ns.SetTransportProtocolOption(tcp.ProtocolNumber, &tcpTXBufOpt); err != nil {
log.Fatalf("could not set TCP TX buf size: %v", err)
}
mslOpt := tcpip.TCPTimeWaitTimeoutOption(5 * time.Second)
if err := ns.SetTransportProtocolOption(tcp.ProtocolNumber, &mslOpt); err != nil {
log.Fatalf("could not set TCP MSL: %v", err)
}
if *debugPort != 0 {
expvar.Publish("netstack", ns.ExpVar())
}
lc, err := ts.LocalClient()
if err != nil {
log.Fatalf("LocalClient() failed: %v", err)

View File

@@ -89,6 +89,8 @@ tailscale.com/cmd/stund dependencies: (generated by github.com/tailscale/depawar
golang.org/x/crypto/cryptobyte/asn1 from crypto/ecdsa+
golang.org/x/crypto/curve25519 from golang.org/x/crypto/nacl/box+
golang.org/x/crypto/hkdf from crypto/tls+
golang.org/x/crypto/internal/alias from golang.org/x/crypto/chacha20+
golang.org/x/crypto/internal/poly1305 from golang.org/x/crypto/chacha20poly1305+
golang.org/x/crypto/nacl/box from tailscale.com/types/key
golang.org/x/crypto/nacl/secretbox from golang.org/x/crypto/nacl/box
golang.org/x/crypto/salsa20/salsa from golang.org/x/crypto/nacl/box+
@@ -123,6 +125,18 @@ tailscale.com/cmd/stund dependencies: (generated by github.com/tailscale/depawar
crypto/ed25519 from crypto/tls+
crypto/elliptic from crypto/ecdsa+
crypto/hmac from crypto/tls+
crypto/internal/alias from crypto/aes+
crypto/internal/bigmod from crypto/ecdsa+
crypto/internal/boring from crypto/aes+
crypto/internal/boring/bbig from crypto/ecdsa+
crypto/internal/boring/sig from crypto/internal/boring
crypto/internal/edwards25519 from crypto/ed25519
crypto/internal/edwards25519/field from crypto/ecdh+
crypto/internal/hpke from crypto/tls
crypto/internal/mlkem768 from crypto/tls
crypto/internal/nistec from crypto/ecdh+
crypto/internal/nistec/fiat from crypto/internal/nistec
crypto/internal/randutil from crypto/dsa+
crypto/md5 from crypto/tls+
crypto/rand from crypto/ed25519+
crypto/rc4 from crypto/tls
@@ -133,6 +147,7 @@ tailscale.com/cmd/stund dependencies: (generated by github.com/tailscale/depawar
crypto/subtle from crypto/aes+
crypto/tls from net/http+
crypto/x509 from crypto/tls
D crypto/x509/internal/macos from crypto/x509
crypto/x509/pkix from crypto/x509
embed from crypto/internal/nistec+
encoding from encoding/json+
@@ -153,6 +168,44 @@ tailscale.com/cmd/stund dependencies: (generated by github.com/tailscale/depawar
hash/fnv from google.golang.org/protobuf/internal/detrand
hash/maphash from go4.org/mem
html from net/http/pprof+
internal/abi from crypto/x509/internal/macos+
internal/asan from syscall
internal/bisect from internal/godebug
internal/bytealg from bytes+
internal/byteorder from crypto/aes+
internal/chacha8rand from math/rand/v2+
internal/concurrent from unique
internal/coverage/rtcov from runtime
internal/cpu from crypto/aes+
internal/filepathlite from os+
internal/fmtsort from fmt
internal/goarch from crypto/aes+
internal/godebug from crypto/tls+
internal/godebugs from internal/godebug+
internal/goexperiment from runtime
internal/goos from crypto/x509+
internal/itoa from internal/poll+
internal/msan from syscall
internal/nettrace from net+
internal/oserror from io/fs+
internal/poll from net+
internal/profile from net/http/pprof
internal/profilerecord from runtime+
internal/race from internal/poll+
internal/reflectlite from context+
internal/runtime/atomic from internal/runtime/exithook+
internal/runtime/exithook from runtime
L internal/runtime/syscall from runtime+
internal/singleflight from net
internal/stringslite from embed+
internal/syscall/execenv from os
LD internal/syscall/unix from crypto/rand+
W internal/syscall/windows from crypto/rand+
W internal/syscall/windows/registry from mime+
W internal/syscall/windows/sysdll from internal/syscall/windows+
internal/testlog from os
internal/unsafeheader from internal/reflectlite+
internal/weak from unique
io from bufio+
io/fs from crypto/x509+
iter from maps+
@@ -171,6 +224,7 @@ tailscale.com/cmd/stund dependencies: (generated by github.com/tailscale/depawar
net/http from expvar+
net/http/httptrace from net/http
net/http/internal from net/http
net/http/internal/ascii from net/http
net/http/pprof from tailscale.com/tsweb
net/netip from go4.org/netipx+
net/textproto from golang.org/x/net/http/httpguts+
@@ -182,7 +236,10 @@ tailscale.com/cmd/stund dependencies: (generated by github.com/tailscale/depawar
reflect from crypto/x509+
regexp from github.com/prometheus/client_golang/prometheus/internal+
regexp/syntax from regexp
runtime from crypto/internal/nistec+
runtime/debug from github.com/prometheus/client_golang/prometheus+
runtime/internal/math from runtime
runtime/internal/sys from runtime
runtime/metrics from github.com/prometheus/client_golang/prometheus+
runtime/pprof from net/http/pprof
runtime/trace from net/http/pprof
@@ -199,3 +256,4 @@ tailscale.com/cmd/stund dependencies: (generated by github.com/tailscale/depawar
unicode/utf16 from crypto/x509+
unicode/utf8 from bufio+
unique from net/netip
unsafe from bytes+

View File

@@ -212,7 +212,7 @@ change in the future.
exitNodeCmd(),
updateCmd,
whoisCmd,
debugCmd,
debugCmd(),
driveCmd,
idTokenCmd,
advertiseCmd(),

View File

@@ -25,10 +25,12 @@ import (
"tailscale.com/tailcfg"
"tailscale.com/tka"
"tailscale.com/tstest"
"tailscale.com/tstest/deptest"
"tailscale.com/types/logger"
"tailscale.com/types/opt"
"tailscale.com/types/persist"
"tailscale.com/types/preftype"
"tailscale.com/util/set"
"tailscale.com/version/distro"
)
@@ -1568,3 +1570,31 @@ func TestDocs(t *testing.T) {
}
walk(t, root)
}
func TestDeps(t *testing.T) {
deptest.DepChecker{
GOOS: "linux",
GOARCH: "arm64",
WantDeps: set.Of(
"tailscale.com/feature/capture/dissector", // want the Lua by default
),
BadDeps: map[string]string{
"tailscale.com/feature/capture": "don't link capture code",
"tailscale.com/net/packet": "why we passing packets in the CLI?",
"tailscale.com/net/flowtrack": "why we tracking flows in the CLI?",
},
}.Check(t)
}
func TestDepsNoCapture(t *testing.T) {
deptest.DepChecker{
GOOS: "linux",
GOARCH: "arm64",
Tags: "ts_omit_capture",
BadDeps: map[string]string{
"tailscale.com/feature/capture": "don't link capture code",
"tailscale.com/feature/capture/dissector": "don't like the Lua",
},
}.Check(t)
}

View File

@@ -0,0 +1,80 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !ios && !ts_omit_capture
package cli
import (
"context"
"flag"
"fmt"
"io"
"os"
"os/exec"
"github.com/peterbourgon/ff/v3/ffcli"
"tailscale.com/feature/capture/dissector"
)
func init() {
debugCaptureCmd = mkDebugCaptureCmd
}
func mkDebugCaptureCmd() *ffcli.Command {
return &ffcli.Command{
Name: "capture",
ShortUsage: "tailscale debug capture",
Exec: runCapture,
ShortHelp: "Stream pcaps for debugging",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("capture")
fs.StringVar(&captureArgs.outFile, "o", "", "path to stream the pcap (or - for stdout), leave empty to start wireshark")
return fs
})(),
}
}
var captureArgs struct {
outFile string
}
func runCapture(ctx context.Context, args []string) error {
stream, err := localClient.StreamDebugCapture(ctx)
if err != nil {
return err
}
defer stream.Close()
switch captureArgs.outFile {
case "-":
fmt.Fprintln(Stderr, "Press Ctrl-C to stop the capture.")
_, err = io.Copy(os.Stdout, stream)
return err
case "":
lua, err := os.CreateTemp("", "ts-dissector")
if err != nil {
return err
}
defer os.Remove(lua.Name())
io.WriteString(lua, dissector.Lua)
if err := lua.Close(); err != nil {
return err
}
wireshark := exec.CommandContext(ctx, "wireshark", "-X", "lua_script:"+lua.Name(), "-k", "-i", "-")
wireshark.Stdin = stream
wireshark.Stdout = os.Stdout
wireshark.Stderr = os.Stderr
return wireshark.Run()
}
f, err := os.OpenFile(captureArgs.outFile, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0644)
if err != nil {
return err
}
defer f.Close()
fmt.Fprintln(Stderr, "Press Ctrl-C to stop the capture.")
_, err = io.Copy(f, stream)
return err
}

View File

@@ -20,7 +20,6 @@ import (
"net/netip"
"net/url"
"os"
"os/exec"
"runtime"
"runtime/debug"
"strconv"
@@ -45,307 +44,302 @@ import (
"tailscale.com/types/key"
"tailscale.com/types/logger"
"tailscale.com/util/must"
"tailscale.com/wgengine/capture"
)
var debugCmd = &ffcli.Command{
Name: "debug",
Exec: runDebug,
ShortUsage: "tailscale debug <debug-flags | subcommand>",
ShortHelp: "Debug commands",
LongHelp: hidden + `"tailscale debug" contains misc debug facilities; it is not a stable interface.`,
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("debug")
fs.StringVar(&debugArgs.file, "file", "", "get, delete:NAME, or NAME")
fs.StringVar(&debugArgs.cpuFile, "cpu-profile", "", "if non-empty, grab a CPU profile for --profile-seconds seconds and write it to this file; - for stdout")
fs.StringVar(&debugArgs.memFile, "mem-profile", "", "if non-empty, grab a memory profile and write it to this file; - for stdout")
fs.IntVar(&debugArgs.cpuSec, "profile-seconds", 15, "number of seconds to run a CPU profile for, when --cpu-profile is non-empty")
return fs
})(),
Subcommands: []*ffcli.Command{
{
Name: "derp-map",
ShortUsage: "tailscale debug derp-map",
Exec: runDERPMap,
ShortHelp: "Print DERP map",
},
{
Name: "component-logs",
ShortUsage: "tailscale debug component-logs [" + strings.Join(ipn.DebuggableComponents, "|") + "]",
Exec: runDebugComponentLogs,
ShortHelp: "Enable/disable debug logs for a component",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("component-logs")
fs.DurationVar(&debugComponentLogsArgs.forDur, "for", time.Hour, "how long to enable debug logs for; zero or negative means to disable")
return fs
})(),
},
{
Name: "daemon-goroutines",
ShortUsage: "tailscale debug daemon-goroutines",
Exec: runDaemonGoroutines,
ShortHelp: "Print tailscaled's goroutines",
},
{
Name: "daemon-logs",
ShortUsage: "tailscale debug daemon-logs",
Exec: runDaemonLogs,
ShortHelp: "Watch tailscaled's server logs",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("daemon-logs")
fs.IntVar(&daemonLogsArgs.verbose, "verbose", 0, "verbosity level")
fs.BoolVar(&daemonLogsArgs.time, "time", false, "include client time")
return fs
})(),
},
{
Name: "metrics",
ShortUsage: "tailscale debug metrics",
Exec: runDaemonMetrics,
ShortHelp: "Print tailscaled's metrics",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("metrics")
fs.BoolVar(&metricsArgs.watch, "watch", false, "print JSON dump of delta values")
return fs
})(),
},
{
Name: "env",
ShortUsage: "tailscale debug env",
Exec: runEnv,
ShortHelp: "Print cmd/tailscale environment",
},
{
Name: "stat",
ShortUsage: "tailscale debug stat <files...>",
Exec: runStat,
ShortHelp: "Stat a file",
},
{
Name: "hostinfo",
ShortUsage: "tailscale debug hostinfo",
Exec: runHostinfo,
ShortHelp: "Print hostinfo",
},
{
Name: "local-creds",
ShortUsage: "tailscale debug local-creds",
Exec: runLocalCreds,
ShortHelp: "Print how to access Tailscale LocalAPI",
},
{
Name: "restun",
ShortUsage: "tailscale debug restun",
Exec: localAPIAction("restun"),
ShortHelp: "Force a magicsock restun",
},
{
Name: "rebind",
ShortUsage: "tailscale debug rebind",
Exec: localAPIAction("rebind"),
ShortHelp: "Force a magicsock rebind",
},
{
Name: "derp-set-on-demand",
ShortUsage: "tailscale debug derp-set-on-demand",
Exec: localAPIAction("derp-set-homeless"),
ShortHelp: "Enable DERP on-demand mode (breaks reachability)",
},
{
Name: "derp-unset-on-demand",
ShortUsage: "tailscale debug derp-unset-on-demand",
Exec: localAPIAction("derp-unset-homeless"),
ShortHelp: "Disable DERP on-demand mode",
},
{
Name: "break-tcp-conns",
ShortUsage: "tailscale debug break-tcp-conns",
Exec: localAPIAction("break-tcp-conns"),
ShortHelp: "Break any open TCP connections from the daemon",
},
{
Name: "break-derp-conns",
ShortUsage: "tailscale debug break-derp-conns",
Exec: localAPIAction("break-derp-conns"),
ShortHelp: "Break any open DERP connections from the daemon",
},
{
Name: "pick-new-derp",
ShortUsage: "tailscale debug pick-new-derp",
Exec: localAPIAction("pick-new-derp"),
ShortHelp: "Switch to some other random DERP home region for a short time",
},
{
Name: "force-prefer-derp",
ShortUsage: "tailscale debug force-prefer-derp",
Exec: forcePreferDERP,
ShortHelp: "Prefer the given region ID if reachable (until restart, or 0 to clear)",
},
{
Name: "force-netmap-update",
ShortUsage: "tailscale debug force-netmap-update",
Exec: localAPIAction("force-netmap-update"),
ShortHelp: "Force a full no-op netmap update (for load testing)",
},
{
// TODO(bradfitz,maisem): eventually promote this out of debug
Name: "reload-config",
ShortUsage: "tailscale debug reload-config",
Exec: reloadConfig,
ShortHelp: "Reload config",
},
{
Name: "control-knobs",
ShortUsage: "tailscale debug control-knobs",
Exec: debugControlKnobs,
ShortHelp: "See current control knobs",
},
{
Name: "prefs",
ShortUsage: "tailscale debug prefs",
Exec: runPrefs,
ShortHelp: "Print prefs",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("prefs")
fs.BoolVar(&prefsArgs.pretty, "pretty", false, "If true, pretty-print output")
return fs
})(),
},
{
Name: "watch-ipn",
ShortUsage: "tailscale debug watch-ipn",
Exec: runWatchIPN,
ShortHelp: "Subscribe to IPN message bus",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("watch-ipn")
fs.BoolVar(&watchIPNArgs.netmap, "netmap", true, "include netmap in messages")
fs.BoolVar(&watchIPNArgs.initial, "initial", false, "include initial status")
fs.BoolVar(&watchIPNArgs.rateLimit, "rate-limit", true, "rate limit messags")
fs.BoolVar(&watchIPNArgs.showPrivateKey, "show-private-key", false, "include node private key in printed netmap")
fs.IntVar(&watchIPNArgs.count, "count", 0, "exit after printing this many statuses, or 0 to keep going forever")
return fs
})(),
},
{
Name: "netmap",
ShortUsage: "tailscale debug netmap",
Exec: runNetmap,
ShortHelp: "Print the current network map",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("netmap")
fs.BoolVar(&netmapArgs.showPrivateKey, "show-private-key", false, "include node private key in printed netmap")
return fs
})(),
},
{
Name: "via",
ShortUsage: "tailscale debug via <site-id> <v4-cidr>\n" +
"tailscale debug via <v6-route>",
Exec: runVia,
ShortHelp: "Convert between site-specific IPv4 CIDRs and IPv6 'via' routes",
},
{
Name: "ts2021",
ShortUsage: "tailscale debug ts2021",
Exec: runTS2021,
ShortHelp: "Debug ts2021 protocol connectivity",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("ts2021")
fs.StringVar(&ts2021Args.host, "host", "controlplane.tailscale.com", "hostname of control plane")
fs.IntVar(&ts2021Args.version, "version", int(tailcfg.CurrentCapabilityVersion), "protocol version")
fs.BoolVar(&ts2021Args.verbose, "verbose", false, "be extra verbose")
return fs
})(),
},
{
Name: "set-expire",
ShortUsage: "tailscale debug set-expire --in=1m",
Exec: runSetExpire,
ShortHelp: "Manipulate node key expiry for testing",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("set-expire")
fs.DurationVar(&setExpireArgs.in, "in", 0, "if non-zero, set node key to expire this duration from now")
return fs
})(),
},
{
Name: "dev-store-set",
ShortUsage: "tailscale debug dev-store-set",
Exec: runDevStoreSet,
ShortHelp: "Set a key/value pair during development",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("store-set")
fs.BoolVar(&devStoreSetArgs.danger, "danger", false, "accept danger")
return fs
})(),
},
{
Name: "derp",
ShortUsage: "tailscale debug derp",
Exec: runDebugDERP,
ShortHelp: "Test a DERP configuration",
},
{
Name: "capture",
ShortUsage: "tailscale debug capture",
Exec: runCapture,
ShortHelp: "Stream pcaps for debugging",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("capture")
fs.StringVar(&captureArgs.outFile, "o", "", "path to stream the pcap (or - for stdout), leave empty to start wireshark")
return fs
})(),
},
{
Name: "portmap",
ShortUsage: "tailscale debug portmap",
Exec: debugPortmap,
ShortHelp: "Run portmap debugging",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("portmap")
fs.DurationVar(&debugPortmapArgs.duration, "duration", 5*time.Second, "timeout for port mapping")
fs.StringVar(&debugPortmapArgs.ty, "type", "", `portmap debug type (one of "", "pmp", "pcp", or "upnp")`)
fs.StringVar(&debugPortmapArgs.gatewayAddr, "gateway-addr", "", `override gateway IP (must also pass --self-addr)`)
fs.StringVar(&debugPortmapArgs.selfAddr, "self-addr", "", `override self IP (must also pass --gateway-addr)`)
fs.BoolVar(&debugPortmapArgs.logHTTP, "log-http", false, `print all HTTP requests and responses to the log`)
return fs
})(),
},
{
Name: "peer-endpoint-changes",
ShortUsage: "tailscale debug peer-endpoint-changes <hostname-or-IP>",
Exec: runPeerEndpointChanges,
ShortHelp: "Print debug information about a peer's endpoint changes",
},
{
Name: "dial-types",
ShortUsage: "tailscale debug dial-types <hostname-or-IP> <port>",
Exec: runDebugDialTypes,
ShortHelp: "Print debug information about connecting to a given host or IP",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("dial-types")
fs.StringVar(&debugDialTypesArgs.network, "network", "tcp", `network type to dial ("tcp", "udp", etc.)`)
return fs
})(),
},
{
Name: "resolve",
ShortUsage: "tailscale debug resolve <hostname>",
Exec: runDebugResolve,
ShortHelp: "Does a DNS lookup",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("resolve")
fs.StringVar(&resolveArgs.net, "net", "ip", "network type to resolve (ip, ip4, ip6)")
return fs
})(),
},
{
Name: "go-buildinfo",
ShortUsage: "tailscale debug go-buildinfo",
ShortHelp: "Print Go's runtime/debug.BuildInfo",
Exec: runGoBuildInfo,
},
},
var (
debugCaptureCmd func() *ffcli.Command // or nil
)
func debugCmd() *ffcli.Command {
return &ffcli.Command{
Name: "debug",
Exec: runDebug,
ShortUsage: "tailscale debug <debug-flags | subcommand>",
ShortHelp: "Debug commands",
LongHelp: hidden + `"tailscale debug" contains misc debug facilities; it is not a stable interface.`,
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("debug")
fs.StringVar(&debugArgs.file, "file", "", "get, delete:NAME, or NAME")
fs.StringVar(&debugArgs.cpuFile, "cpu-profile", "", "if non-empty, grab a CPU profile for --profile-seconds seconds and write it to this file; - for stdout")
fs.StringVar(&debugArgs.memFile, "mem-profile", "", "if non-empty, grab a memory profile and write it to this file; - for stdout")
fs.IntVar(&debugArgs.cpuSec, "profile-seconds", 15, "number of seconds to run a CPU profile for, when --cpu-profile is non-empty")
return fs
})(),
Subcommands: nonNilCmds([]*ffcli.Command{
{
Name: "derp-map",
ShortUsage: "tailscale debug derp-map",
Exec: runDERPMap,
ShortHelp: "Print DERP map",
},
{
Name: "component-logs",
ShortUsage: "tailscale debug component-logs [" + strings.Join(ipn.DebuggableComponents, "|") + "]",
Exec: runDebugComponentLogs,
ShortHelp: "Enable/disable debug logs for a component",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("component-logs")
fs.DurationVar(&debugComponentLogsArgs.forDur, "for", time.Hour, "how long to enable debug logs for; zero or negative means to disable")
return fs
})(),
},
{
Name: "daemon-goroutines",
ShortUsage: "tailscale debug daemon-goroutines",
Exec: runDaemonGoroutines,
ShortHelp: "Print tailscaled's goroutines",
},
{
Name: "daemon-logs",
ShortUsage: "tailscale debug daemon-logs",
Exec: runDaemonLogs,
ShortHelp: "Watch tailscaled's server logs",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("daemon-logs")
fs.IntVar(&daemonLogsArgs.verbose, "verbose", 0, "verbosity level")
fs.BoolVar(&daemonLogsArgs.time, "time", false, "include client time")
return fs
})(),
},
{
Name: "metrics",
ShortUsage: "tailscale debug metrics",
Exec: runDaemonMetrics,
ShortHelp: "Print tailscaled's metrics",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("metrics")
fs.BoolVar(&metricsArgs.watch, "watch", false, "print JSON dump of delta values")
return fs
})(),
},
{
Name: "env",
ShortUsage: "tailscale debug env",
Exec: runEnv,
ShortHelp: "Print cmd/tailscale environment",
},
{
Name: "stat",
ShortUsage: "tailscale debug stat <files...>",
Exec: runStat,
ShortHelp: "Stat a file",
},
{
Name: "hostinfo",
ShortUsage: "tailscale debug hostinfo",
Exec: runHostinfo,
ShortHelp: "Print hostinfo",
},
{
Name: "local-creds",
ShortUsage: "tailscale debug local-creds",
Exec: runLocalCreds,
ShortHelp: "Print how to access Tailscale LocalAPI",
},
{
Name: "restun",
ShortUsage: "tailscale debug restun",
Exec: localAPIAction("restun"),
ShortHelp: "Force a magicsock restun",
},
{
Name: "rebind",
ShortUsage: "tailscale debug rebind",
Exec: localAPIAction("rebind"),
ShortHelp: "Force a magicsock rebind",
},
{
Name: "derp-set-on-demand",
ShortUsage: "tailscale debug derp-set-on-demand",
Exec: localAPIAction("derp-set-homeless"),
ShortHelp: "Enable DERP on-demand mode (breaks reachability)",
},
{
Name: "derp-unset-on-demand",
ShortUsage: "tailscale debug derp-unset-on-demand",
Exec: localAPIAction("derp-unset-homeless"),
ShortHelp: "Disable DERP on-demand mode",
},
{
Name: "break-tcp-conns",
ShortUsage: "tailscale debug break-tcp-conns",
Exec: localAPIAction("break-tcp-conns"),
ShortHelp: "Break any open TCP connections from the daemon",
},
{
Name: "break-derp-conns",
ShortUsage: "tailscale debug break-derp-conns",
Exec: localAPIAction("break-derp-conns"),
ShortHelp: "Break any open DERP connections from the daemon",
},
{
Name: "pick-new-derp",
ShortUsage: "tailscale debug pick-new-derp",
Exec: localAPIAction("pick-new-derp"),
ShortHelp: "Switch to some other random DERP home region for a short time",
},
{
Name: "force-prefer-derp",
ShortUsage: "tailscale debug force-prefer-derp",
Exec: forcePreferDERP,
ShortHelp: "Prefer the given region ID if reachable (until restart, or 0 to clear)",
},
{
Name: "force-netmap-update",
ShortUsage: "tailscale debug force-netmap-update",
Exec: localAPIAction("force-netmap-update"),
ShortHelp: "Force a full no-op netmap update (for load testing)",
},
{
// TODO(bradfitz,maisem): eventually promote this out of debug
Name: "reload-config",
ShortUsage: "tailscale debug reload-config",
Exec: reloadConfig,
ShortHelp: "Reload config",
},
{
Name: "control-knobs",
ShortUsage: "tailscale debug control-knobs",
Exec: debugControlKnobs,
ShortHelp: "See current control knobs",
},
{
Name: "prefs",
ShortUsage: "tailscale debug prefs",
Exec: runPrefs,
ShortHelp: "Print prefs",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("prefs")
fs.BoolVar(&prefsArgs.pretty, "pretty", false, "If true, pretty-print output")
return fs
})(),
},
{
Name: "watch-ipn",
ShortUsage: "tailscale debug watch-ipn",
Exec: runWatchIPN,
ShortHelp: "Subscribe to IPN message bus",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("watch-ipn")
fs.BoolVar(&watchIPNArgs.netmap, "netmap", true, "include netmap in messages")
fs.BoolVar(&watchIPNArgs.initial, "initial", false, "include initial status")
fs.BoolVar(&watchIPNArgs.rateLimit, "rate-limit", true, "rate limit messags")
fs.BoolVar(&watchIPNArgs.showPrivateKey, "show-private-key", false, "include node private key in printed netmap")
fs.IntVar(&watchIPNArgs.count, "count", 0, "exit after printing this many statuses, or 0 to keep going forever")
return fs
})(),
},
{
Name: "netmap",
ShortUsage: "tailscale debug netmap",
Exec: runNetmap,
ShortHelp: "Print the current network map",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("netmap")
fs.BoolVar(&netmapArgs.showPrivateKey, "show-private-key", false, "include node private key in printed netmap")
return fs
})(),
},
{
Name: "via",
ShortUsage: "tailscale debug via <site-id> <v4-cidr>\n" +
"tailscale debug via <v6-route>",
Exec: runVia,
ShortHelp: "Convert between site-specific IPv4 CIDRs and IPv6 'via' routes",
},
{
Name: "ts2021",
ShortUsage: "tailscale debug ts2021",
Exec: runTS2021,
ShortHelp: "Debug ts2021 protocol connectivity",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("ts2021")
fs.StringVar(&ts2021Args.host, "host", "controlplane.tailscale.com", "hostname of control plane")
fs.IntVar(&ts2021Args.version, "version", int(tailcfg.CurrentCapabilityVersion), "protocol version")
fs.BoolVar(&ts2021Args.verbose, "verbose", false, "be extra verbose")
return fs
})(),
},
{
Name: "set-expire",
ShortUsage: "tailscale debug set-expire --in=1m",
Exec: runSetExpire,
ShortHelp: "Manipulate node key expiry for testing",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("set-expire")
fs.DurationVar(&setExpireArgs.in, "in", 0, "if non-zero, set node key to expire this duration from now")
return fs
})(),
},
{
Name: "dev-store-set",
ShortUsage: "tailscale debug dev-store-set",
Exec: runDevStoreSet,
ShortHelp: "Set a key/value pair during development",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("store-set")
fs.BoolVar(&devStoreSetArgs.danger, "danger", false, "accept danger")
return fs
})(),
},
{
Name: "derp",
ShortUsage: "tailscale debug derp",
Exec: runDebugDERP,
ShortHelp: "Test a DERP configuration",
},
ccall(debugCaptureCmd),
{
Name: "portmap",
ShortUsage: "tailscale debug portmap",
Exec: debugPortmap,
ShortHelp: "Run portmap debugging",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("portmap")
fs.DurationVar(&debugPortmapArgs.duration, "duration", 5*time.Second, "timeout for port mapping")
fs.StringVar(&debugPortmapArgs.ty, "type", "", `portmap debug type (one of "", "pmp", "pcp", or "upnp")`)
fs.StringVar(&debugPortmapArgs.gatewayAddr, "gateway-addr", "", `override gateway IP (must also pass --self-addr)`)
fs.StringVar(&debugPortmapArgs.selfAddr, "self-addr", "", `override self IP (must also pass --gateway-addr)`)
fs.BoolVar(&debugPortmapArgs.logHTTP, "log-http", false, `print all HTTP requests and responses to the log`)
return fs
})(),
},
{
Name: "peer-endpoint-changes",
ShortUsage: "tailscale debug peer-endpoint-changes <hostname-or-IP>",
Exec: runPeerEndpointChanges,
ShortHelp: "Print debug information about a peer's endpoint changes",
},
{
Name: "dial-types",
ShortUsage: "tailscale debug dial-types <hostname-or-IP> <port>",
Exec: runDebugDialTypes,
ShortHelp: "Print debug information about connecting to a given host or IP",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("dial-types")
fs.StringVar(&debugDialTypesArgs.network, "network", "tcp", `network type to dial ("tcp", "udp", etc.)`)
return fs
})(),
},
{
Name: "resolve",
ShortUsage: "tailscale debug resolve <hostname>",
Exec: runDebugResolve,
ShortHelp: "Does a DNS lookup",
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("resolve")
fs.StringVar(&resolveArgs.net, "net", "ip", "network type to resolve (ip, ip4, ip6)")
return fs
})(),
},
{
Name: "go-buildinfo",
ShortUsage: "tailscale debug go-buildinfo",
ShortHelp: "Print Go's runtime/debug.BuildInfo",
Exec: runGoBuildInfo,
},
}...),
}
}
func runGoBuildInfo(ctx context.Context, args []string) error {
@@ -1036,50 +1030,6 @@ func runSetExpire(ctx context.Context, args []string) error {
return localClient.DebugSetExpireIn(ctx, setExpireArgs.in)
}
var captureArgs struct {
outFile string
}
func runCapture(ctx context.Context, args []string) error {
stream, err := localClient.StreamDebugCapture(ctx)
if err != nil {
return err
}
defer stream.Close()
switch captureArgs.outFile {
case "-":
fmt.Fprintln(Stderr, "Press Ctrl-C to stop the capture.")
_, err = io.Copy(os.Stdout, stream)
return err
case "":
lua, err := os.CreateTemp("", "ts-dissector")
if err != nil {
return err
}
defer os.Remove(lua.Name())
lua.Write([]byte(capture.DissectorLua))
if err := lua.Close(); err != nil {
return err
}
wireshark := exec.CommandContext(ctx, "wireshark", "-X", "lua_script:"+lua.Name(), "-k", "-i", "-")
wireshark.Stdin = stream
wireshark.Stdout = os.Stdout
wireshark.Stderr = os.Stderr
return wireshark.Run()
}
f, err := os.OpenFile(captureArgs.outFile, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0644)
if err != nil {
return err
}
defer f.Close()
fmt.Fprintln(Stderr, "Press Ctrl-C to stop the capture.")
_, err = io.Copy(f, stream)
return err
}
var debugPortmapArgs struct {
duration time.Duration
gatewayAddr string

View File

@@ -139,7 +139,7 @@ func newUpFlagSet(goos string, upArgs *upArgsT, cmd string) *flag.FlagSet {
// Some flags are only for "up", not "login".
upf.BoolVar(&upArgs.json, "json", false, "output in JSON format (WARNING: format subject to change)")
upf.BoolVar(&upArgs.reset, "reset", false, "reset unspecified settings to their default values")
upf.BoolVar(&upArgs.forceReauth, "force-reauth", false, "force reauthentication")
upf.BoolVar(&upArgs.forceReauth, "force-reauth", false, "force reauthentication (WARNING: this will bring down the Tailscale connection and thus should not be done remotely over SSH or RDP)")
registerAcceptRiskFlag(upf, &upArgs.acceptedRisks)
}

View File

@@ -88,6 +88,7 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
tailscale.com/drive from tailscale.com/client/tailscale+
tailscale.com/envknob from tailscale.com/client/tailscale+
tailscale.com/envknob/featureknob from tailscale.com/client/web
tailscale.com/feature/capture/dissector from tailscale.com/cmd/tailscale/cli
tailscale.com/health from tailscale.com/net/tlsdial+
tailscale.com/health/healthmsg from tailscale.com/cmd/tailscale/cli
tailscale.com/hostinfo from tailscale.com/client/web+
@@ -102,7 +103,6 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
tailscale.com/net/dns/recursive from tailscale.com/net/dnsfallback
tailscale.com/net/dnscache from tailscale.com/control/controlhttp+
tailscale.com/net/dnsfallback from tailscale.com/control/controlhttp+
tailscale.com/net/flowtrack from tailscale.com/net/packet
tailscale.com/net/netaddr from tailscale.com/ipn+
tailscale.com/net/netcheck from tailscale.com/cmd/tailscale/cli
tailscale.com/net/neterror from tailscale.com/net/netcheck+
@@ -110,7 +110,6 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
💣 tailscale.com/net/netmon from tailscale.com/cmd/tailscale/cli+
💣 tailscale.com/net/netns from tailscale.com/derp/derphttp+
tailscale.com/net/netutil from tailscale.com/client/tailscale+
tailscale.com/net/packet from tailscale.com/wgengine/capture
tailscale.com/net/ping from tailscale.com/net/netcheck
tailscale.com/net/portmapper from tailscale.com/cmd/tailscale/cli+
tailscale.com/net/sockstats from tailscale.com/control/controlhttp+
@@ -133,7 +132,7 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
tailscale.com/tsweb/varz from tailscale.com/util/usermetric
tailscale.com/types/dnstype from tailscale.com/tailcfg+
tailscale.com/types/empty from tailscale.com/ipn
tailscale.com/types/ipproto from tailscale.com/net/flowtrack+
tailscale.com/types/ipproto from tailscale.com/ipn+
tailscale.com/types/key from tailscale.com/client/tailscale+
tailscale.com/types/lazy from tailscale.com/util/testenv+
tailscale.com/types/logger from tailscale.com/client/web+
@@ -185,7 +184,6 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
W 💣 tailscale.com/util/winutil/winenv from tailscale.com/hostinfo+
tailscale.com/version from tailscale.com/client/web+
tailscale.com/version/distro from tailscale.com/client/web+
tailscale.com/wgengine/capture from tailscale.com/cmd/tailscale/cli
tailscale.com/wgengine/filter/filtertype from tailscale.com/types/netmap
golang.org/x/crypto/argon2 from tailscale.com/tka
golang.org/x/crypto/blake2b from golang.org/x/crypto/argon2+
@@ -196,6 +194,8 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
golang.org/x/crypto/cryptobyte/asn1 from crypto/ecdsa+
golang.org/x/crypto/curve25519 from golang.org/x/crypto/nacl/box+
golang.org/x/crypto/hkdf from crypto/tls+
golang.org/x/crypto/internal/alias from golang.org/x/crypto/chacha20+
golang.org/x/crypto/internal/poly1305 from golang.org/x/crypto/chacha20poly1305+
golang.org/x/crypto/nacl/box from tailscale.com/types/key
golang.org/x/crypto/nacl/secretbox from golang.org/x/crypto/nacl/box
golang.org/x/crypto/pbkdf2 from software.sslmate.com/src/go-pkcs12
@@ -211,6 +211,10 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
golang.org/x/net/http2/hpack from net/http+
golang.org/x/net/icmp from tailscale.com/net/ping
golang.org/x/net/idna from golang.org/x/net/http/httpguts+
golang.org/x/net/internal/httpcommon from golang.org/x/net/http2
golang.org/x/net/internal/iana from golang.org/x/net/icmp+
golang.org/x/net/internal/socket from golang.org/x/net/icmp+
golang.org/x/net/internal/socks from golang.org/x/net/proxy
golang.org/x/net/ipv4 from github.com/miekg/dns+
golang.org/x/net/ipv6 from github.com/miekg/dns+
golang.org/x/net/proxy from tailscale.com/net/netns
@@ -249,6 +253,18 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
crypto/ed25519 from crypto/tls+
crypto/elliptic from crypto/ecdsa+
crypto/hmac from crypto/tls+
crypto/internal/alias from crypto/aes+
crypto/internal/bigmod from crypto/ecdsa+
crypto/internal/boring from crypto/aes+
crypto/internal/boring/bbig from crypto/ecdsa+
crypto/internal/boring/sig from crypto/internal/boring
crypto/internal/edwards25519 from crypto/ed25519
crypto/internal/edwards25519/field from crypto/ecdh+
crypto/internal/hpke from crypto/tls
crypto/internal/mlkem768 from crypto/tls
crypto/internal/nistec from crypto/ecdh+
crypto/internal/nistec/fiat from crypto/internal/nistec
crypto/internal/randutil from crypto/dsa+
crypto/md5 from crypto/tls+
crypto/rand from crypto/ed25519+
crypto/rc4 from crypto/tls
@@ -259,6 +275,7 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
crypto/subtle from crypto/aes+
crypto/tls from github.com/miekg/dns+
crypto/x509 from crypto/tls+
D crypto/x509/internal/macos from crypto/x509
crypto/x509/pkix from crypto/x509+
DW database/sql/driver from github.com/google/uuid
W debug/dwarf from debug/pe
@@ -287,6 +304,44 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
image from github.com/skip2/go-qrcode+
image/color from github.com/skip2/go-qrcode+
image/png from github.com/skip2/go-qrcode
internal/abi from crypto/x509/internal/macos+
internal/asan from syscall
internal/bisect from internal/godebug
internal/bytealg from bytes+
internal/byteorder from crypto/aes+
internal/chacha8rand from math/rand/v2+
internal/concurrent from unique
internal/coverage/rtcov from runtime
internal/cpu from crypto/aes+
internal/filepathlite from os+
internal/fmtsort from fmt+
internal/goarch from crypto/aes+
internal/godebug from archive/tar+
internal/godebugs from internal/godebug+
internal/goexperiment from runtime
internal/goos from crypto/x509+
internal/itoa from internal/poll+
internal/msan from syscall
internal/nettrace from net+
internal/oserror from io/fs+
internal/poll from net+
internal/profilerecord from runtime
internal/race from internal/poll+
internal/reflectlite from context+
internal/runtime/atomic from internal/runtime/exithook+
internal/runtime/exithook from runtime
L internal/runtime/syscall from runtime+
internal/saferio from debug/pe+
internal/singleflight from net
internal/stringslite from embed+
internal/syscall/execenv from os+
LD internal/syscall/unix from crypto/rand+
W internal/syscall/windows from crypto/rand+
W internal/syscall/windows/registry from mime+
W internal/syscall/windows/sysdll from internal/syscall/windows+
internal/testlog from os
internal/unsafeheader from internal/reflectlite+
internal/weak from unique
io from archive/tar+
io/fs from archive/tar+
io/ioutil from github.com/mitchellh/go-ps+
@@ -308,6 +363,7 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
net/http/httptrace from golang.org/x/net/http2+
net/http/httputil from tailscale.com/client/web+
net/http/internal from net/http+
net/http/internal/ascii from net/http+
net/netip from go4.org/netipx+
net/textproto from golang.org/x/net/http/httpguts+
net/url from crypto/x509+
@@ -320,7 +376,10 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
reflect from archive/tar+
regexp from github.com/coreos/go-iptables/iptables+
regexp/syntax from regexp
runtime from archive/tar+
runtime/debug from tailscale.com+
runtime/internal/math from runtime
runtime/internal/sys from runtime
slices from tailscale.com/client/web+
sort from compress/flate+
strconv from archive/tar+
@@ -336,3 +395,4 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
unicode/utf16 from crypto/x509+
unicode/utf8 from bufio+
unique from net/netip
unsafe from bytes+

View File

@@ -152,7 +152,6 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
W 💣 github.com/tailscale/go-winio/internal/socket from github.com/tailscale/go-winio
W github.com/tailscale/go-winio/internal/stringbuffer from github.com/tailscale/go-winio/internal/fs
W github.com/tailscale/go-winio/pkg/guid from github.com/tailscale/go-winio+
github.com/tailscale/golang-x-crypto/acme from tailscale.com/ipn/ipnlocal
LD github.com/tailscale/golang-x-crypto/internal/poly1305 from github.com/tailscale/golang-x-crypto/ssh
LD github.com/tailscale/golang-x-crypto/ssh from tailscale.com/ipn/ipnlocal+
LD github.com/tailscale/golang-x-crypto/ssh/internal/bcrypt_pbkdf from github.com/tailscale/golang-x-crypto/ssh
@@ -260,6 +259,7 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
tailscale.com/envknob from tailscale.com/client/tailscale+
tailscale.com/envknob/featureknob from tailscale.com/client/web+
tailscale.com/feature from tailscale.com/feature/wakeonlan+
tailscale.com/feature/capture from tailscale.com/feature/condregister
tailscale.com/feature/condregister from tailscale.com/cmd/tailscaled
L tailscale.com/feature/tap from tailscale.com/feature/condregister
tailscale.com/feature/wakeonlan from tailscale.com/feature/condregister
@@ -273,7 +273,7 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
tailscale.com/ipn/ipnlocal from tailscale.com/cmd/tailscaled+
tailscale.com/ipn/ipnserver from tailscale.com/cmd/tailscaled
tailscale.com/ipn/ipnstate from tailscale.com/client/tailscale+
tailscale.com/ipn/localapi from tailscale.com/ipn/ipnserver
tailscale.com/ipn/localapi from tailscale.com/ipn/ipnserver+
tailscale.com/ipn/policy from tailscale.com/ipn/ipnlocal
tailscale.com/ipn/store from tailscale.com/cmd/tailscaled+
L tailscale.com/ipn/store/awsstore from tailscale.com/ipn/store
@@ -338,8 +338,10 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
tailscale.com/syncs from tailscale.com/cmd/tailscaled+
tailscale.com/tailcfg from tailscale.com/client/tailscale+
tailscale.com/taildrop from tailscale.com/ipn/ipnlocal+
tailscale.com/tempfork/acme from tailscale.com/ipn/ipnlocal
LD tailscale.com/tempfork/gliderlabs/ssh from tailscale.com/ssh/tailssh
tailscale.com/tempfork/heap from tailscale.com/wgengine/magicsock
tailscale.com/tempfork/httprec from tailscale.com/control/controlclient
tailscale.com/tka from tailscale.com/client/tailscale+
tailscale.com/tsconst from tailscale.com/net/netmon+
tailscale.com/tsd from tailscale.com/cmd/tailscaled+
@@ -422,7 +424,6 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
tailscale.com/version/distro from tailscale.com/client/web+
W tailscale.com/wf from tailscale.com/cmd/tailscaled
tailscale.com/wgengine from tailscale.com/cmd/tailscaled+
tailscale.com/wgengine/capture from tailscale.com/ipn/ipnlocal+
tailscale.com/wgengine/filter from tailscale.com/control/controlclient+
tailscale.com/wgengine/filter/filtertype from tailscale.com/types/netmap+
💣 tailscale.com/wgengine/magicsock from tailscale.com/ipn/ipnlocal+
@@ -445,12 +446,15 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
golang.org/x/crypto/cryptobyte/asn1 from crypto/ecdsa+
golang.org/x/crypto/curve25519 from github.com/tailscale/golang-x-crypto/ssh+
golang.org/x/crypto/hkdf from crypto/tls+
golang.org/x/crypto/internal/alias from golang.org/x/crypto/chacha20+
golang.org/x/crypto/internal/poly1305 from golang.org/x/crypto/chacha20poly1305+
golang.org/x/crypto/nacl/box from tailscale.com/types/key
golang.org/x/crypto/nacl/secretbox from golang.org/x/crypto/nacl/box
golang.org/x/crypto/poly1305 from github.com/tailscale/wireguard-go/device
golang.org/x/crypto/salsa20/salsa from golang.org/x/crypto/nacl/box+
golang.org/x/crypto/sha3 from crypto/internal/mlkem768+
LD golang.org/x/crypto/ssh from github.com/pkg/sftp+
LD golang.org/x/crypto/ssh/internal/bcrypt_pbkdf from golang.org/x/crypto/ssh
golang.org/x/exp/constraints from github.com/dblohm7/wingoes/pe+
golang.org/x/exp/maps from tailscale.com/ipn/store/mem+
golang.org/x/net/bpf from github.com/mdlayher/genetlink+
@@ -462,6 +466,10 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
golang.org/x/net/http2/hpack from golang.org/x/net/http2+
golang.org/x/net/icmp from tailscale.com/net/ping+
golang.org/x/net/idna from golang.org/x/net/http/httpguts+
golang.org/x/net/internal/httpcommon from golang.org/x/net/http2
golang.org/x/net/internal/iana from golang.org/x/net/icmp+
golang.org/x/net/internal/socket from golang.org/x/net/icmp+
golang.org/x/net/internal/socks from golang.org/x/net/proxy
golang.org/x/net/ipv4 from github.com/miekg/dns+
golang.org/x/net/ipv6 from github.com/miekg/dns+
golang.org/x/net/proxy from tailscale.com/net/netns
@@ -501,6 +509,18 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
crypto/ed25519 from crypto/tls+
crypto/elliptic from crypto/ecdsa+
crypto/hmac from crypto/tls+
crypto/internal/alias from crypto/aes+
crypto/internal/bigmod from crypto/ecdsa+
crypto/internal/boring from crypto/aes+
crypto/internal/boring/bbig from crypto/ecdsa+
crypto/internal/boring/sig from crypto/internal/boring
crypto/internal/edwards25519 from crypto/ed25519
crypto/internal/edwards25519/field from crypto/ecdh+
crypto/internal/hpke from crypto/tls
crypto/internal/mlkem768 from crypto/tls
crypto/internal/nistec from crypto/ecdh+
crypto/internal/nistec/fiat from crypto/internal/nistec
crypto/internal/randutil from crypto/dsa+
crypto/md5 from crypto/tls+
crypto/rand from crypto/ed25519+
crypto/rc4 from crypto/tls+
@@ -511,6 +531,7 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
crypto/subtle from crypto/aes+
crypto/tls from github.com/aws/aws-sdk-go-v2/aws/transport/http+
crypto/x509 from crypto/tls+
D crypto/x509/internal/macos from crypto/x509
crypto/x509/pkix from crypto/x509+
DW database/sql/driver from github.com/google/uuid
W debug/dwarf from debug/pe
@@ -528,7 +549,7 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
encoding/xml from github.com/aws/aws-sdk-go-v2/aws/protocol/xml+
errors from archive/tar+
expvar from tailscale.com/derp+
flag from net/http/httptest+
flag from tailscale.com/cmd/tailscaled+
fmt from archive/tar+
hash from compress/zlib+
hash/adler32 from compress/zlib+
@@ -536,6 +557,45 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
hash/maphash from go4.org/mem
html from html/template+
html/template from github.com/gorilla/csrf
internal/abi from crypto/x509/internal/macos+
internal/asan from syscall
internal/bisect from internal/godebug
internal/bytealg from bytes+
internal/byteorder from crypto/aes+
internal/chacha8rand from math/rand/v2+
internal/concurrent from unique
internal/coverage/rtcov from runtime
internal/cpu from crypto/aes+
internal/filepathlite from os+
internal/fmtsort from fmt+
internal/goarch from crypto/aes+
internal/godebug from archive/tar+
internal/godebugs from internal/godebug+
internal/goexperiment from runtime
internal/goos from crypto/x509+
internal/itoa from internal/poll+
internal/msan from syscall
internal/nettrace from net+
internal/oserror from io/fs+
internal/poll from net+
internal/profile from net/http/pprof
internal/profilerecord from runtime+
internal/race from internal/poll+
internal/reflectlite from context+
internal/runtime/atomic from internal/runtime/exithook+
internal/runtime/exithook from runtime
L internal/runtime/syscall from runtime+
internal/saferio from debug/pe+
internal/singleflight from net
internal/stringslite from embed+
internal/syscall/execenv from os+
LD internal/syscall/unix from crypto/rand+
W internal/syscall/windows from crypto/rand+
W internal/syscall/windows/registry from mime+
W internal/syscall/windows/sysdll from internal/syscall/windows+
internal/testlog from os
internal/unsafeheader from internal/reflectlite+
internal/weak from unique
io from archive/tar+
io/fs from archive/tar+
io/ioutil from github.com/aws/aws-sdk-go-v2/aws/protocol/query+
@@ -554,10 +614,10 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
mime/quotedprintable from mime/multipart
net from crypto/tls+
net/http from expvar+
net/http/httptest from tailscale.com/control/controlclient
net/http/httptrace from github.com/prometheus-community/pro-bing+
net/http/httputil from github.com/aws/smithy-go/transport/http+
net/http/internal from net/http+
net/http/internal/ascii from net/http+
net/http/pprof from tailscale.com/cmd/tailscaled+
net/netip from github.com/tailscale/wireguard-go/conn+
net/textproto from github.com/aws/aws-sdk-go-v2/aws/signer/v4+
@@ -571,7 +631,10 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
reflect from archive/tar+
regexp from github.com/aws/aws-sdk-go-v2/internal/endpoints/awsrulesfn+
regexp/syntax from regexp
runtime from archive/tar+
runtime/debug from github.com/aws/aws-sdk-go-v2/internal/sync/singleflight+
runtime/internal/math from runtime
runtime/internal/sys from runtime
runtime/pprof from net/http/pprof+
runtime/trace from net/http/pprof
slices from tailscale.com/appc+
@@ -589,3 +652,4 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
unicode/utf16 from crypto/x509+
unicode/utf8 from bufio+
unique from net/netip
unsafe from bytes+

View File

@@ -22,6 +22,8 @@ func TestDeps(t *testing.T) {
BadDeps: map[string]string{
"testing": "do not use testing package in production code",
"gvisor.dev/gvisor/pkg/hostarch": "will crash on non-4K page sizes; see https://github.com/tailscale/tailscale/issues/8658",
"net/http/httptest": "do not use httptest in production code",
"net/http/internal/testcert": "do not use httptest in production code",
},
}.Check(t)

View File

@@ -21,6 +21,7 @@ import (
"tailscale.com/types/netmap"
"tailscale.com/types/persist"
"tailscale.com/types/structs"
"tailscale.com/util/clientmetric"
"tailscale.com/util/execqueue"
)
@@ -131,6 +132,8 @@ type Auto struct {
// the server.
lastUpdateGen updateGen
lastStatus atomic.Pointer[Status]
paused bool // whether we should stop making HTTP requests
unpauseWaiters []chan bool // chans that gets sent true (once) on wake, or false on Shutdown
loggedIn bool // true if currently logged in
@@ -596,21 +599,90 @@ func (c *Auto) sendStatus(who string, err error, url string, nm *netmap.NetworkM
// not logged in.
nm = nil
}
new := Status{
newSt := &Status{
URL: url,
Persist: p,
NetMap: nm,
Err: err,
state: state,
}
c.lastStatus.Store(newSt)
// Launch a new goroutine to avoid blocking the caller while the observer
// does its thing, which may result in a call back into the client.
metricQueued.Add(1)
c.observerQueue.Add(func() {
c.observer.SetControlClientStatus(c, new)
if canSkipStatus(newSt, c.lastStatus.Load()) {
metricSkippable.Add(1)
if !c.direct.controlKnobs.DisableSkipStatusQueue.Load() {
metricSkipped.Add(1)
return
}
}
c.observer.SetControlClientStatus(c, *newSt)
// Best effort stop retaining the memory now that we've sent it to the
// observer (LocalBackend). We CAS here because the caller goroutine is
// doing a Store which we want to win a race. This is only a memory
// optimization and is not for correctness.
//
// If the CAS fails, that means somebody else's Store replaced our
// pointer (so mission accomplished: our netmap is no longer retained in
// any case) and that Store caller will be responsible for removing
// their own netmap (or losing their race too, down the chain).
// Eventually the last caller will win this CAS and zero lastStatus.
c.lastStatus.CompareAndSwap(newSt, nil)
})
}
var (
metricQueued = clientmetric.NewCounter("controlclient_auto_status_queued")
metricSkippable = clientmetric.NewCounter("controlclient_auto_status_queue_skippable")
metricSkipped = clientmetric.NewCounter("controlclient_auto_status_queue_skipped")
)
// canSkipStatus reports whether we can skip sending s1, knowing
// that s2 is enqueued sometime in the future after s1.
//
// s1 must be non-nil. s2 may be nil.
func canSkipStatus(s1, s2 *Status) bool {
if s2 == nil {
// Nothing in the future.
return false
}
if s1 == s2 {
// If the last item in the queue is the same as s1,
// we can't skip it.
return false
}
if s1.Err != nil || s1.URL != "" {
// If s1 has an error or a URL, we shouldn't skip it, lest the error go
// away in s2 or in-between. We want to make sure all the subsystems see
// it. Plus there aren't many of these, so not worth skipping.
return false
}
if !s1.Persist.Equals(s2.Persist) || s1.state != s2.state {
// If s1 has a different Persist or state than s2,
// don't skip it. We only care about skipping the typical
// entries where the only difference is the NetMap.
return false
}
// If nothing above precludes it, and both s1 and s2 have NetMaps, then
// we can skip it, because s2's NetMap is a newer version and we can
// jump straight from whatever state we had before to s2's state,
// without passing through s1's state first. A NetMap is regrettably a
// full snapshot of the state, not an incremental delta. We're slowly
// moving towards passing around only deltas around internally at all
// layers, but this is explicitly the case where we didn't have a delta
// path for the message we received over the wire and had to resort
// to the legacy full NetMap path. And then we can get behind processing
// these full NetMap snapshots in LocalBackend/wgengine/magicsock/netstack
// and this path (when it returns true) lets us skip over useless work
// and not get behind in the queue. This matters in particular for tailnets
// that are both very large + very churny.
return s1.NetMap != nil && s2.NetMap != nil
}
func (c *Auto) Login(flags LoginFlags) {
c.logf("client.Login(%v)", flags)

View File

@@ -4,8 +4,13 @@
package controlclient
import (
"io"
"reflect"
"slices"
"testing"
"tailscale.com/types/netmap"
"tailscale.com/types/persist"
)
func fieldsOf(t reflect.Type) (fields []string) {
@@ -62,3 +67,83 @@ func TestStatusEqual(t *testing.T) {
}
}
}
// tests [canSkipStatus].
func TestCanSkipStatus(t *testing.T) {
st := new(Status)
nm1 := &netmap.NetworkMap{}
nm2 := &netmap.NetworkMap{}
tests := []struct {
name string
s1, s2 *Status
want bool
}{
{
name: "nil-s2",
s1: st,
s2: nil,
want: false,
},
{
name: "equal",
s1: st,
s2: st,
want: false,
},
{
name: "s1-error",
s1: &Status{Err: io.EOF, NetMap: nm1},
s2: &Status{NetMap: nm2},
want: false,
},
{
name: "s1-url",
s1: &Status{URL: "foo", NetMap: nm1},
s2: &Status{NetMap: nm2},
want: false,
},
{
name: "s1-persist-diff",
s1: &Status{Persist: new(persist.Persist).View(), NetMap: nm1},
s2: &Status{NetMap: nm2},
want: false,
},
{
name: "s1-state-diff",
s1: &Status{state: 123, NetMap: nm1},
s2: &Status{NetMap: nm2},
want: false,
},
{
name: "s1-no-netmap1",
s1: &Status{NetMap: nil},
s2: &Status{NetMap: nm2},
want: false,
},
{
name: "s1-no-netmap2",
s1: &Status{NetMap: nm1},
s2: &Status{NetMap: nil},
want: false,
},
{
name: "skip",
s1: &Status{NetMap: nm1},
s2: &Status{NetMap: nm2},
want: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := canSkipStatus(tt.s1, tt.s2); got != tt.want {
t.Errorf("canSkipStatus = %v, want %v", got, tt.want)
}
})
}
want := []string{"Err", "URL", "NetMap", "Persist", "state"}
if f := fieldsOf(reflect.TypeFor[Status]()); !slices.Equal(f, want) {
t.Errorf("Status fields = %q; this code was only written to handle fields %q", f, want)
}
}

View File

@@ -15,7 +15,6 @@ import (
"log"
"net"
"net/http"
"net/http/httptest"
"net/netip"
"net/url"
"os"
@@ -42,6 +41,7 @@ import (
"tailscale.com/net/tsdial"
"tailscale.com/net/tshttpproxy"
"tailscale.com/tailcfg"
"tailscale.com/tempfork/httprec"
"tailscale.com/tka"
"tailscale.com/tstime"
"tailscale.com/types/key"
@@ -1384,7 +1384,7 @@ func answerC2NPing(logf logger.Logf, c2nHandler http.Handler, c *http.Client, pr
handlerCtx, cancel := context.WithTimeout(context.Background(), handlerTimeout)
defer cancel()
hreq = hreq.WithContext(handlerCtx)
rec := httptest.NewRecorder()
rec := httprec.NewRecorder()
c2nHandler.ServeHTTP(rec, hreq)
cancel()

View File

@@ -195,10 +195,6 @@ func (ms *mapSession) HandleNonKeepAliveMapResponse(ctx context.Context, resp *t
ms.updateStateFromResponse(resp)
// Occasionally clean up old userprofile if it grows too much
// from e.g. ephemeral tagged nodes.
ms.cleanLastUserProfile()
if ms.tryHandleIncrementally(resp) {
ms.occasionallyPrintSummary(ms.lastNetmapSummary)
return nil
@@ -296,10 +292,20 @@ func (ms *mapSession) updateStateFromResponse(resp *tailcfg.MapResponse) {
for _, up := range resp.UserProfiles {
ms.lastUserProfile[up.ID] = up
}
// TODO(bradfitz): clean up old user profiles? maybe not worth it.
if dm := resp.DERPMap; dm != nil {
ms.vlogf("netmap: new map contains DERP map")
// Guard against the control server accidentally sending
// a nil region definition, which at least Headscale was
// observed to send.
for rid, r := range dm.Regions {
if r == nil {
delete(dm.Regions, rid)
}
}
// Zero-valued fields in a DERPMap mean that we're not changing
// anything and are using the previous value(s).
if ldm := ms.lastDERPMap; ldm != nil {
@@ -535,32 +541,6 @@ func (ms *mapSession) addUserProfile(nm *netmap.NetworkMap, userID tailcfg.UserI
}
}
// cleanLastUserProfile deletes any entries from lastUserProfile
// that are not referenced by any peer or the self node.
//
// This is expensive enough that we don't do this on every message
// from the server, but only when it's grown enough to matter.
func (ms *mapSession) cleanLastUserProfile() {
if len(ms.lastUserProfile) < len(ms.peers)*2 {
// Hasn't grown enough to be worth cleaning.
return
}
keep := set.Set[tailcfg.UserID]{}
if node := ms.lastNode; node.Valid() {
keep.Add(node.User())
}
for _, n := range ms.peers {
keep.Add(n.User())
keep.Add(n.Sharer())
}
for userID := range ms.lastUserProfile {
if !keep.Contains(userID) {
delete(ms.lastUserProfile, userID)
}
}
}
var debugPatchifyPeer = envknob.RegisterBool("TS_DEBUG_PATCHIFY_PEER")
// patchifyPeersChanged mutates resp to promote PeersChanged entries to PeersChangedPatch

View File

@@ -6,6 +6,8 @@
package controlknobs
import (
"fmt"
"reflect"
"sync/atomic"
"tailscale.com/syncs"
@@ -103,6 +105,11 @@ type Knobs struct {
// DisableCaptivePortalDetection is whether the node should not perform captive portal detection
// automatically when the network state changes.
DisableCaptivePortalDetection atomic.Bool
// DisableSkipStatusQueue is whether the node should disable skipping
// of queued netmap.NetworkMap between the controlclient and LocalBackend.
// See tailscale/tailscale#14768.
DisableSkipStatusQueue atomic.Bool
}
// UpdateFromNodeAttributes updates k (if non-nil) based on the provided self
@@ -132,6 +139,7 @@ func (k *Knobs) UpdateFromNodeAttributes(capMap tailcfg.NodeCapMap) {
disableLocalDNSOverrideViaNRPT = has(tailcfg.NodeAttrDisableLocalDNSOverrideViaNRPT)
disableCryptorouting = has(tailcfg.NodeAttrDisableMagicSockCryptoRouting)
disableCaptivePortalDetection = has(tailcfg.NodeAttrDisableCaptivePortalDetection)
disableSkipStatusQueue = has(tailcfg.NodeAttrDisableSkipStatusQueue)
)
if has(tailcfg.NodeAttrOneCGNATEnable) {
@@ -159,6 +167,7 @@ func (k *Knobs) UpdateFromNodeAttributes(capMap tailcfg.NodeCapMap) {
k.DisableLocalDNSOverrideViaNRPT.Store(disableLocalDNSOverrideViaNRPT)
k.DisableCryptorouting.Store(disableCryptorouting)
k.DisableCaptivePortalDetection.Store(disableCaptivePortalDetection)
k.DisableSkipStatusQueue.Store(disableSkipStatusQueue)
}
// AsDebugJSON returns k as something that can be marshalled with json.Marshal
@@ -167,25 +176,19 @@ func (k *Knobs) AsDebugJSON() map[string]any {
if k == nil {
return nil
}
return map[string]any{
"DisableUPnP": k.DisableUPnP.Load(),
"KeepFullWGConfig": k.KeepFullWGConfig.Load(),
"RandomizeClientPort": k.RandomizeClientPort.Load(),
"OneCGNAT": k.OneCGNAT.Load(),
"ForceBackgroundSTUN": k.ForceBackgroundSTUN.Load(),
"DisableDeltaUpdates": k.DisableDeltaUpdates.Load(),
"PeerMTUEnable": k.PeerMTUEnable.Load(),
"DisableDNSForwarderTCPRetries": k.DisableDNSForwarderTCPRetries.Load(),
"SilentDisco": k.SilentDisco.Load(),
"LinuxForceIPTables": k.LinuxForceIPTables.Load(),
"LinuxForceNfTables": k.LinuxForceNfTables.Load(),
"SeamlessKeyRenewal": k.SeamlessKeyRenewal.Load(),
"ProbeUDPLifetime": k.ProbeUDPLifetime.Load(),
"AppCStoreRoutes": k.AppCStoreRoutes.Load(),
"UserDialUseRoutes": k.UserDialUseRoutes.Load(),
"DisableSplitDNSWhenNoCustomResolvers": k.DisableSplitDNSWhenNoCustomResolvers.Load(),
"DisableLocalDNSOverrideViaNRPT": k.DisableLocalDNSOverrideViaNRPT.Load(),
"DisableCryptorouting": k.DisableCryptorouting.Load(),
"DisableCaptivePortalDetection": k.DisableCaptivePortalDetection.Load(),
ret := map[string]any{}
rt := reflect.TypeFor[Knobs]()
rv := reflect.ValueOf(k).Elem() // of *k
for i := 0; i < rt.NumField(); i++ {
name := rt.Field(i).Name
switch v := rv.Field(i).Addr().Interface().(type) {
case *atomic.Bool:
ret[name] = v.Load()
case *syncs.AtomicValue[opt.Bool]:
ret[name] = v.Load()
default:
panic(fmt.Sprintf("unknown field type %T for %v", v, name))
}
}
return ret
}

View File

@@ -6,6 +6,8 @@ package controlknobs
import (
"reflect"
"testing"
"tailscale.com/types/logger"
)
func TestAsDebugJSON(t *testing.T) {
@@ -18,4 +20,5 @@ func TestAsDebugJSON(t *testing.T) {
if want := reflect.TypeFor[Knobs]().NumField(); len(got) != want {
t.Errorf("AsDebugJSON map has %d fields; want %v", len(got), want)
}
t.Logf("Got: %v", logger.AsJSON(got))
}

View File

@@ -55,8 +55,7 @@ func CanRunTailscaleSSH() error {
func CanUseExitNode() error {
switch dist := distro.Get(); dist {
case distro.Synology, // see https://github.com/tailscale/tailscale/issues/1995
distro.QNAP,
distro.Unraid:
distro.QNAP:
return errors.New("Tailscale exit nodes cannot be used on " + string(dist))
}

View File

@@ -13,21 +13,44 @@ import (
"sync"
"time"
_ "embed"
"tailscale.com/feature"
"tailscale.com/ipn/localapi"
"tailscale.com/net/packet"
"tailscale.com/util/set"
)
//go:embed ts-dissector.lua
var DissectorLua string
func init() {
feature.Register("capture")
localapi.Register("debug-capture", serveLocalAPIDebugCapture)
}
// Callback describes a function which is called to
// record packets when debugging packet-capture.
// Such callbacks must not take ownership of the
// provided data slice: it may only copy out of it
// within the lifetime of the function.
type Callback func(Path, time.Time, []byte, packet.CaptureMeta)
func serveLocalAPIDebugCapture(h *localapi.Handler, w http.ResponseWriter, r *http.Request) {
ctx := r.Context()
if !h.PermitWrite {
http.Error(w, "debug access denied", http.StatusForbidden)
return
}
if r.Method != "POST" {
http.Error(w, "POST required", http.StatusMethodNotAllowed)
return
}
w.WriteHeader(http.StatusOK)
w.(http.Flusher).Flush()
b := h.LocalBackend()
s := b.GetOrSetCaptureSink(newSink)
unregister := s.RegisterOutput(w)
select {
case <-ctx.Done():
case <-s.WaitCh():
}
unregister()
b.ClearCaptureSink()
}
var bufferPool = sync.Pool{
New: func() any {
@@ -57,29 +80,8 @@ func writePktHeader(w *bytes.Buffer, when time.Time, length int) {
binary.Write(w, binary.LittleEndian, uint32(length)) // total length
}
// Path describes where in the data path the packet was captured.
type Path uint8
// Valid Path values.
const (
// FromLocal indicates the packet was logged as it traversed the FromLocal path:
// i.e.: A packet from the local system into the TUN.
FromLocal Path = 0
// FromPeer indicates the packet was logged upon reception from a remote peer.
FromPeer Path = 1
// SynthesizedToLocal indicates the packet was generated from within tailscaled,
// and is being routed to the local machine's network stack.
SynthesizedToLocal Path = 2
// SynthesizedToPeer indicates the packet was generated from within tailscaled,
// and is being routed to a remote Wireguard peer.
SynthesizedToPeer Path = 3
// PathDisco indicates the packet is information about a disco frame.
PathDisco Path = 254
)
// New creates a new capture sink.
func New() *Sink {
// newSink creates a new capture sink.
func newSink() packet.CaptureSink {
ctx, c := context.WithCancel(context.Background())
return &Sink{
ctx: ctx,
@@ -126,6 +128,10 @@ func (s *Sink) RegisterOutput(w io.Writer) (unregister func()) {
}
}
func (s *Sink) CaptureCallback() packet.CaptureCallback {
return s.LogPacket
}
// NumOutputs returns the number of outputs registered with the sink.
func (s *Sink) NumOutputs() int {
s.mu.Lock()
@@ -174,7 +180,7 @@ func customDataLen(meta packet.CaptureMeta) int {
// LogPacket is called to insert a packet into the capture.
//
// This function does not take ownership of the provided data slice.
func (s *Sink) LogPacket(path Path, when time.Time, data []byte, meta packet.CaptureMeta) {
func (s *Sink) LogPacket(path packet.CapturePath, when time.Time, data []byte, meta packet.CaptureMeta) {
select {
case <-s.ctx.Done():
return

View File

@@ -0,0 +1,12 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
// Package dissector contains the Lua dissector for Tailscale packets.
package dissector
import (
_ "embed"
)
//go:embed ts-dissector.lua
var Lua string

View File

@@ -0,0 +1,8 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !ios && !ts_omit_capture
package condregister
import _ "tailscale.com/feature/capture"

18
go.mod
View File

@@ -74,7 +74,7 @@ require (
github.com/skip2/go-qrcode v0.0.0-20200617195104-da1b6568686e
github.com/studio-b12/gowebdav v0.9.0
github.com/tailscale/certstore v0.1.1-0.20231202035212-d3fa0460f47e
github.com/tailscale/depaware v0.0.0-20210622194025-720c4b409502
github.com/tailscale/depaware v0.0.0-20250112153213-b748de04d81b
github.com/tailscale/goexpect v0.0.0-20210902213824-6e8c725cea41
github.com/tailscale/golang-x-crypto v0.0.0-20240604161659-3fde5e568aa4
github.com/tailscale/goupnp v1.0.1-0.20210804011211-c64d0f06ea05
@@ -82,7 +82,7 @@ require (
github.com/tailscale/mkctr v0.0.0-20250110151924-54977352e4a6
github.com/tailscale/netlink v1.1.1-0.20240822203006-4d49adab4de7
github.com/tailscale/peercred v0.0.0-20250107143737-35a0c7bd7edc
github.com/tailscale/web-client-prebuilt v0.0.0-20240226180453-5db17b287bf1
github.com/tailscale/web-client-prebuilt v0.0.0-20250124233751-d4cd19a26976
github.com/tailscale/wf v0.0.0-20240214030419-6fbb0a674ee6
github.com/tailscale/wireguard-go v0.0.0-20250107165329-0b8b35511f19
github.com/tailscale/xnet v0.0.0-20240729143630-8497ac4dab2e
@@ -94,14 +94,14 @@ require (
go.uber.org/zap v1.27.0
go4.org/mem v0.0.0-20240501181205-ae6ca9944745
go4.org/netipx v0.0.0-20231129151722-fdeea329fbba
golang.org/x/crypto v0.32.0
golang.org/x/crypto v0.33.0
golang.org/x/exp v0.0.0-20250106191152-7588d65b2ba8
golang.org/x/mod v0.22.0
golang.org/x/net v0.34.0
golang.org/x/net v0.35.0
golang.org/x/oauth2 v0.25.0
golang.org/x/sync v0.10.0
golang.org/x/sys v0.29.1-0.20250107080300-1c14dcadc3ab
golang.org/x/term v0.28.0
golang.org/x/sync v0.11.0
golang.org/x/sys v0.30.0
golang.org/x/term v0.29.0
golang.org/x/time v0.9.0
golang.org/x/tools v0.29.0
golang.zx2c4.com/wintun v0.0.0-20230126152724-0fa3db229ce2
@@ -265,7 +265,7 @@ require (
github.com/gordonklaus/ineffassign v0.1.0 // indirect
github.com/goreleaser/chglog v0.5.0 // indirect
github.com/goreleaser/fileglob v1.3.0 // indirect
github.com/gorilla/csrf v1.7.2
github.com/gorilla/csrf v1.7.3-0.20250123201450-9dd6af1f6d30
github.com/gostaticanalysis/analysisutil v0.7.1 // indirect
github.com/gostaticanalysis/comment v1.4.2 // indirect
github.com/gostaticanalysis/forcetypeassert v0.1.0 // indirect
@@ -385,7 +385,7 @@ require (
go.uber.org/multierr v1.11.0 // indirect
golang.org/x/exp/typeparams v0.0.0-20240314144324-c7f7c6466f7f // indirect
golang.org/x/image v0.23.0 // indirect
golang.org/x/text v0.21.0 // indirect
golang.org/x/text v0.22.0 // indirect
gomodules.xyz/jsonpatch/v2 v2.4.0 // indirect
google.golang.org/protobuf v1.35.1 // indirect
gopkg.in/inf.v0 v0.9.1 // indirect

36
go.sum
View File

@@ -529,8 +529,8 @@ github.com/goreleaser/fileglob v1.3.0 h1:/X6J7U8lbDpQtBvGcwwPS6OpzkNVlVEsFUVRx9+
github.com/goreleaser/fileglob v1.3.0/go.mod h1:Jx6BoXv3mbYkEzwm9THo7xbr5egkAraxkGorbJb4RxU=
github.com/goreleaser/nfpm/v2 v2.33.1 h1:EkdAzZyVhAI9JC1vjmjjbmnNzyH1J6Cu4JCsA7YcQuc=
github.com/goreleaser/nfpm/v2 v2.33.1/go.mod h1:8wwWWvJWmn84xo/Sqiv0aMvEGTHlHZTXTEuVSgQpkIM=
github.com/gorilla/csrf v1.7.2 h1:oTUjx0vyf2T+wkrx09Trsev1TE+/EbDAeHtSTbtC2eI=
github.com/gorilla/csrf v1.7.2/go.mod h1:F1Fj3KG23WYHE6gozCmBAezKookxbIvUJT+121wTuLk=
github.com/gorilla/csrf v1.7.3-0.20250123201450-9dd6af1f6d30 h1:fiJdrgVBkjZ5B1HJ2WQwNOaXB+QyYcNXTA3t1XYLz0M=
github.com/gorilla/csrf v1.7.3-0.20250123201450-9dd6af1f6d30/go.mod h1:F1Fj3KG23WYHE6gozCmBAezKookxbIvUJT+121wTuLk=
github.com/gorilla/securecookie v1.1.2 h1:YCIWL56dvtr73r6715mJs5ZvhtnY73hBvEF8kXD8ePA=
github.com/gorilla/securecookie v1.1.2/go.mod h1:NfCASbcHqRSY+3a8tlWJwsQap2VX5pwzwo4h3eOamfo=
github.com/gostaticanalysis/analysisutil v0.7.1 h1:ZMCjoue3DtDWQ5WyU16YbjbQEQ3VuzwxALrpYd+HeKk=
@@ -915,8 +915,8 @@ github.com/t-yuki/gocover-cobertura v0.0.0-20180217150009-aaee18c8195c h1:+aPplB
github.com/t-yuki/gocover-cobertura v0.0.0-20180217150009-aaee18c8195c/go.mod h1:SbErYREK7xXdsRiigaQiQkI9McGRzYMvlKYaP3Nimdk=
github.com/tailscale/certstore v0.1.1-0.20231202035212-d3fa0460f47e h1:PtWT87weP5LWHEY//SWsYkSO3RWRZo4OSWagh3YD2vQ=
github.com/tailscale/certstore v0.1.1-0.20231202035212-d3fa0460f47e/go.mod h1:XrBNfAFN+pwoWuksbFS9Ccxnopa15zJGgXRFN90l3K4=
github.com/tailscale/depaware v0.0.0-20210622194025-720c4b409502 h1:34icjjmqJ2HPjrSuJYEkdZ+0ItmGQAQ75cRHIiftIyE=
github.com/tailscale/depaware v0.0.0-20210622194025-720c4b409502/go.mod h1:p9lPsd+cx33L3H9nNoecRRxPssFKUwwI50I3pZ0yT+8=
github.com/tailscale/depaware v0.0.0-20250112153213-b748de04d81b h1:ewWb4cA+YO9/3X+v5UhdV+eKFsNBOPcGRh39Glshx/4=
github.com/tailscale/depaware v0.0.0-20250112153213-b748de04d81b/go.mod h1:p9lPsd+cx33L3H9nNoecRRxPssFKUwwI50I3pZ0yT+8=
github.com/tailscale/go-winio v0.0.0-20231025203758-c4f33415bf55 h1:Gzfnfk2TWrk8Jj4P4c1a3CtQyMaTVCznlkLZI++hok4=
github.com/tailscale/go-winio v0.0.0-20231025203758-c4f33415bf55/go.mod h1:4k4QO+dQ3R5FofL+SanAUZe+/QfeK0+OIuwDIRu2vSg=
github.com/tailscale/goexpect v0.0.0-20210902213824-6e8c725cea41 h1:/V2rCMMWcsjYaYO2MeovLw+ClP63OtXgCF2Y1eb8+Ns=
@@ -933,8 +933,8 @@ github.com/tailscale/netlink v1.1.1-0.20240822203006-4d49adab4de7 h1:uFsXVBE9Qr4
github.com/tailscale/netlink v1.1.1-0.20240822203006-4d49adab4de7/go.mod h1:NzVQi3Mleb+qzq8VmcWpSkcSYxXIg0DkI6XDzpVkhJ0=
github.com/tailscale/peercred v0.0.0-20250107143737-35a0c7bd7edc h1:24heQPtnFR+yfntqhI3oAu9i27nEojcQ4NuBQOo5ZFA=
github.com/tailscale/peercred v0.0.0-20250107143737-35a0c7bd7edc/go.mod h1:f93CXfllFsO9ZQVq+Zocb1Gp4G5Fz0b0rXHLOzt/Djc=
github.com/tailscale/web-client-prebuilt v0.0.0-20240226180453-5db17b287bf1 h1:tdUdyPqJ0C97SJfjB9tW6EylTtreyee9C44de+UBG0g=
github.com/tailscale/web-client-prebuilt v0.0.0-20240226180453-5db17b287bf1/go.mod h1:agQPE6y6ldqCOui2gkIh7ZMztTkIQKH049tv8siLuNQ=
github.com/tailscale/web-client-prebuilt v0.0.0-20250124233751-d4cd19a26976 h1:UBPHPtv8+nEAy2PD8RyAhOYvau1ek0HDJqLS/Pysi14=
github.com/tailscale/web-client-prebuilt v0.0.0-20250124233751-d4cd19a26976/go.mod h1:agQPE6y6ldqCOui2gkIh7ZMztTkIQKH049tv8siLuNQ=
github.com/tailscale/wf v0.0.0-20240214030419-6fbb0a674ee6 h1:l10Gi6w9jxvinoiq15g8OToDdASBni4CyJOdHY1Hr8M=
github.com/tailscale/wf v0.0.0-20240214030419-6fbb0a674ee6/go.mod h1:ZXRML051h7o4OcI0d3AaILDIad/Xw0IkXaHM17dic1Y=
github.com/tailscale/wireguard-go v0.0.0-20250107165329-0b8b35511f19 h1:BcEJP2ewTIK2ZCsqgl6YGpuO6+oKqqag5HHb7ehljKw=
@@ -1058,8 +1058,8 @@ golang.org/x/crypto v0.0.0-20210921155107-089bfa567519/go.mod h1:GvvjBRRGRdwPK5y
golang.org/x/crypto v0.0.0-20220622213112-05595931fe9d/go.mod h1:IxCIyHEi3zRg3s0A5j5BB6A9Jmi73HwBIUl50j+osU4=
golang.org/x/crypto v0.1.0/go.mod h1:RecgLatLF4+eUMCP1PoPZQb+cVrJcOPbHkTkbkB9sbw=
golang.org/x/crypto v0.3.0/go.mod h1:hebNnKkNXi2UzZN1eVRvBB7co0a+JxK6XbPiWVs/3J4=
golang.org/x/crypto v0.32.0 h1:euUpcYgM8WcP71gNpTqQCn6rC2t6ULUPiOzfWaXVVfc=
golang.org/x/crypto v0.32.0/go.mod h1:ZnnJkOaASj8g0AjIduWNlq2NRxL0PlBrbKVyZ6V/Ugc=
golang.org/x/crypto v0.33.0 h1:IOBPskki6Lysi0lo9qQvbxiQ+FvsCC/YWOecCHAixus=
golang.org/x/crypto v0.33.0/go.mod h1:bVdXmD7IV/4GdElGPozy6U7lWdRXA4qyRVGJV57uQ5M=
golang.org/x/exp v0.0.0-20190121172915-509febef88a4/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA=
golang.org/x/exp v0.0.0-20190306152737-a1d7652674e8/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA=
golang.org/x/exp v0.0.0-20190510132918-efd6b22b2522/go.mod h1:ZjyILWgesfNpC6sMxTJOJm9Kp84zZh5NQWvqDGG3Qr8=
@@ -1148,8 +1148,8 @@ golang.org/x/net v0.1.0/go.mod h1:Cx3nUiGt4eDBEyega/BKRp+/AlGL8hYe7U9odMt2Cco=
golang.org/x/net v0.2.0/go.mod h1:KqCZLdyyvdV855qA2rE3GC2aiw5xGR5TEjj8smXukLY=
golang.org/x/net v0.5.0/go.mod h1:DivGGAXEgPSlEBzxGzZI+ZLohi+xUj054jfeKui00ws=
golang.org/x/net v0.6.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
golang.org/x/net v0.34.0 h1:Mb7Mrk043xzHgnRM88suvJFwzVrRfHEHJEl5/71CKw0=
golang.org/x/net v0.34.0/go.mod h1:di0qlW3YNM5oh6GqDGQr92MyTozJPmybPK4Ev/Gm31k=
golang.org/x/net v0.35.0 h1:T5GQRQb2y08kTAByq9L4/bz8cipCdA8FbRTXewonqY8=
golang.org/x/net v0.35.0/go.mod h1:EglIi67kWsHKlRzzVMUD93VMSWGFOMSZgxFjparz1Qk=
golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U=
golang.org/x/oauth2 v0.0.0-20190226205417-e64efc72b421/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
@@ -1171,8 +1171,8 @@ golang.org/x/sync v0.0.0-20201207232520-09787c993a3a/go.mod h1:RxMgew5VJxzue5/jJ
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.10.0 h1:3NQrjDixjgGwUOCaF8w2+VYHv0Ve/vGYSbdkTa98gmQ=
golang.org/x/sync v0.10.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sync v0.11.0 h1:GGz8+XQP4FvTTrjZPzNKTMFtSXH80RAzG+5ghFPgK9w=
golang.org/x/sync v0.11.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20180905080454-ebe1bf3edb33/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20181116152217-5ac8a444bdc5/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
@@ -1231,16 +1231,16 @@ golang.org/x/sys v0.2.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.4.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.29.1-0.20250107080300-1c14dcadc3ab h1:BMkEEWYOjkvOX7+YKOGbp6jCyQ5pR2j0Ah47p1Vdsx4=
golang.org/x/sys v0.29.1-0.20250107080300-1c14dcadc3ab/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.30.0 h1:QjkSwP/36a20jFYWkSue1YwXzLmsV5Gfq7Eiy72C1uc=
golang.org/x/sys v0.30.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=
golang.org/x/term v0.1.0/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=
golang.org/x/term v0.2.0/go.mod h1:TVmDHMZPmdnySmBfhjOoOdhjzdE1h4u1VwSiw2l1Nuc=
golang.org/x/term v0.4.0/go.mod h1:9P2UbLfCdcvo3p/nzKvsmas4TnlujnuoV9hGgYzW1lQ=
golang.org/x/term v0.5.0/go.mod h1:jMB1sMXY+tzblOD4FWmEbocvup2/aLOaQEp7JmGp78k=
golang.org/x/term v0.28.0 h1:/Ts8HFuMR2E6IP/jlo7QVLZHggjKQbhu/7H0LJFr3Gg=
golang.org/x/term v0.28.0/go.mod h1:Sw/lC2IAUZ92udQNf3WodGtn4k/XoLyZoh8v/8uiwek=
golang.org/x/term v0.29.0 h1:L6pJp37ocefwRRtYPKSWOWzOtWSxVajvz2ldH/xi3iU=
golang.org/x/term v0.29.0/go.mod h1:6bl4lRlvVuDgSf3179VpIxBF0o10JUpXWOnI7nErv7s=
golang.org/x/text v0.0.0-20170915032832-14c0d48ead0c/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.1-0.20180807135948-17ff2d5776d2/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
@@ -1251,8 +1251,8 @@ golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ=
golang.org/x/text v0.4.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
golang.org/x/text v0.6.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
golang.org/x/text v0.21.0 h1:zyQAAkrwaneQ066sspRyJaG9VNi/YJ1NfzcGB3hZ/qo=
golang.org/x/text v0.21.0/go.mod h1:4IBbMaMmOPCJ8SecivzSH54+73PCFmPWxNTLm+vZkEQ=
golang.org/x/text v0.22.0 h1:bofq7m3/HAFvbF51jz3Q9wLg3jkvSPuiZu/pD1XwgtM=
golang.org/x/text v0.22.0/go.mod h1:YRoo4H8PVmsu+E3Ou7cqLVH8oXWIHVoX0jqUWALQhfY=
golang.org/x/time v0.0.0-20181108054448-85acf8d2951c/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/time v0.0.0-20190308202827-9d24e82272b4/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/time v0.0.0-20191024005414-555d28b269f0/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=

View File

@@ -1 +1 @@
161c3b79ed91039e65eb148f2547dea6b91e2247
64f7854906c3121fe3ada3d05f1936d3420d6ffa

View File

@@ -22,6 +22,7 @@ import (
"tailscale.com/envknob"
"tailscale.com/metrics"
"tailscale.com/tailcfg"
"tailscale.com/tstime"
"tailscale.com/types/opt"
"tailscale.com/util/cibuild"
"tailscale.com/util/mak"
@@ -73,6 +74,8 @@ type Tracker struct {
// mu should not be held during init.
initOnce sync.Once
testClock tstime.Clock // nil means use time.Now / tstime.StdClock{}
// mu guards everything that follows.
mu sync.Mutex
@@ -80,13 +83,13 @@ type Tracker struct {
warnableVal map[*Warnable]*warningState
// pendingVisibleTimers contains timers for Warnables that are unhealthy, but are
// not visible to the user yet, because they haven't been unhealthy for TimeToVisible
pendingVisibleTimers map[*Warnable]*time.Timer
pendingVisibleTimers map[*Warnable]tstime.TimerController
// sysErr maps subsystems to their current error (or nil if the subsystem is healthy)
// Deprecated: using Warnables should be preferred
sysErr map[Subsystem]error
watchers set.HandleSet[func(*Warnable, *UnhealthyState)] // opt func to run if error state changes
timer *time.Timer
timer tstime.TimerController
latestVersion *tailcfg.ClientVersion // or nil
checkForUpdates bool
@@ -115,6 +118,20 @@ type Tracker struct {
metricHealthMessage *metrics.MultiLabelMap[metricHealthMessageLabel]
}
func (t *Tracker) now() time.Time {
if t.testClock != nil {
return t.testClock.Now()
}
return time.Now()
}
func (t *Tracker) clock() tstime.Clock {
if t.testClock != nil {
return t.testClock
}
return tstime.StdClock{}
}
// Subsystem is the name of a subsystem whose health can be monitored.
//
// Deprecated: Registering a Warnable using Register() and updating its health state
@@ -214,9 +231,11 @@ type Warnable struct {
// TODO(angott): turn this into a SeverityFunc, which allows the Warnable to change its severity based on
// the Args of the unhappy state, just like we do in the Text function.
Severity Severity
// DependsOn is a set of Warnables that this Warnable depends, on and need to be healthy
// before this Warnable can also be healthy again. The GUI can use this information to ignore
// DependsOn is a set of Warnables that this Warnable depends on and need to be healthy
// before this Warnable is relevant. The GUI can use this information to ignore
// this Warnable if one of its dependencies is unhealthy.
// That is, if any of these Warnables are unhealthy, then this Warnable is not relevant
// and should be considered healthy to bother the user about.
DependsOn []*Warnable
// MapDebugFlag is a MapRequest.DebugFlag that is sent to control when this Warnable is unhealthy
@@ -309,11 +328,11 @@ func (ws *warningState) Equal(other *warningState) bool {
// IsVisible returns whether the Warnable should be visible to the user, based on the TimeToVisible
// field of the Warnable and the BrokenSince time when the Warnable became unhealthy.
func (w *Warnable) IsVisible(ws *warningState) bool {
func (w *Warnable) IsVisible(ws *warningState, clockNow func() time.Time) bool {
if ws == nil || w.TimeToVisible == 0 {
return true
}
return time.Since(ws.BrokenSince) >= w.TimeToVisible
return clockNow().Sub(ws.BrokenSince) >= w.TimeToVisible
}
// SetMetricsRegistry sets up the metrics for the Tracker. It takes
@@ -363,7 +382,7 @@ func (t *Tracker) setUnhealthyLocked(w *Warnable, args Args) {
// If we already have a warningState for this Warnable with an earlier BrokenSince time, keep that
// BrokenSince time.
brokenSince := time.Now()
brokenSince := t.now()
if existingWS := t.warnableVal[w]; existingWS != nil {
brokenSince = existingWS.BrokenSince
}
@@ -382,15 +401,15 @@ func (t *Tracker) setUnhealthyLocked(w *Warnable, args Args) {
// If the Warnable has been unhealthy for more than its TimeToVisible, the callback should be
// executed immediately. Otherwise, the callback should be enqueued to run once the Warnable
// becomes visible.
if w.IsVisible(ws) {
if w.IsVisible(ws, t.now) {
go cb(w, w.unhealthyState(ws))
continue
}
// The time remaining until the Warnable will be visible to the user is the TimeToVisible
// minus the time that has already passed since the Warnable became unhealthy.
visibleIn := w.TimeToVisible - time.Since(brokenSince)
mak.Set(&t.pendingVisibleTimers, w, time.AfterFunc(visibleIn, func() {
visibleIn := w.TimeToVisible - t.now().Sub(brokenSince)
var tc tstime.TimerController = t.clock().AfterFunc(visibleIn, func() {
t.mu.Lock()
defer t.mu.Unlock()
// Check if the Warnable is still unhealthy, as it could have become healthy between the time
@@ -399,7 +418,8 @@ func (t *Tracker) setUnhealthyLocked(w *Warnable, args Args) {
go cb(w, w.unhealthyState(ws))
delete(t.pendingVisibleTimers, w)
}
}))
})
mak.Set(&t.pendingVisibleTimers, w, tc)
}
}
}
@@ -474,7 +494,7 @@ func (t *Tracker) RegisterWatcher(cb func(w *Warnable, r *UnhealthyState)) (unre
}
handle := t.watchers.Add(cb)
if t.timer == nil {
t.timer = time.AfterFunc(time.Minute, t.timerSelfCheck)
t.timer = t.clock().AfterFunc(time.Minute, t.timerSelfCheck)
}
return func() {
t.mu.Lock()
@@ -638,10 +658,10 @@ func (t *Tracker) GotStreamedMapResponse() {
}
t.mu.Lock()
defer t.mu.Unlock()
t.lastStreamedMapResponse = time.Now()
t.lastStreamedMapResponse = t.now()
if !t.inMapPoll {
t.inMapPoll = true
t.inMapPollSince = time.Now()
t.inMapPollSince = t.now()
}
t.selfCheckLocked()
}
@@ -658,7 +678,7 @@ func (t *Tracker) SetOutOfPollNetMap() {
return
}
t.inMapPoll = false
t.lastMapPollEndedAt = time.Now()
t.lastMapPollEndedAt = t.now()
t.selfCheckLocked()
}
@@ -700,7 +720,7 @@ func (t *Tracker) NoteMapRequestHeard(mr *tailcfg.MapRequest) {
// against SetMagicSockDERPHome and
// SetDERPRegionConnectedState
t.lastMapRequestHeard = time.Now()
t.lastMapRequestHeard = t.now()
t.selfCheckLocked()
}
@@ -738,7 +758,7 @@ func (t *Tracker) NoteDERPRegionReceivedFrame(region int) {
}
t.mu.Lock()
defer t.mu.Unlock()
mak.Set(&t.derpRegionLastFrame, region, time.Now())
mak.Set(&t.derpRegionLastFrame, region, t.now())
t.selfCheckLocked()
}
@@ -797,9 +817,9 @@ func (t *Tracker) SetIPNState(state string, wantRunning bool) {
// The first time we see wantRunning=true and it used to be false, it means the user requested
// the backend to start. We store this timestamp and use it to silence some warnings that are
// expected during startup.
t.ipnWantRunningLastTrue = time.Now()
t.ipnWantRunningLastTrue = t.now()
t.setUnhealthyLocked(warmingUpWarnable, nil)
time.AfterFunc(warmingUpWarnableDuration, func() {
t.clock().AfterFunc(warmingUpWarnableDuration, func() {
t.mu.Lock()
t.updateWarmingUpWarnableLocked()
t.mu.Unlock()
@@ -936,10 +956,13 @@ func (t *Tracker) Strings() []string {
func (t *Tracker) stringsLocked() []string {
result := []string{}
for w, ws := range t.warnableVal {
if !w.IsVisible(ws) {
if !w.IsVisible(ws, t.now) {
// Do not append invisible warnings.
continue
}
if t.isEffectivelyHealthyLocked(w) {
continue
}
if ws.Args == nil {
result = append(result, w.Text(Args{}))
} else {
@@ -1005,7 +1028,7 @@ func (t *Tracker) updateBuiltinWarnablesLocked() {
t.setHealthyLocked(localLogWarnable)
}
now := time.Now()
now := t.now()
// How long we assume we'll have heard a DERP frame or a MapResponse
// KeepAlive by.
@@ -1015,8 +1038,10 @@ func (t *Tracker) updateBuiltinWarnablesLocked() {
recentlyOn := now.Sub(t.ipnWantRunningLastTrue) < 5*time.Second
homeDERP := t.derpHomeRegion
if recentlyOn {
if recentlyOn || !t.inMapPoll {
// If user just turned Tailscale on, don't warn for a bit.
// Also, if we're not in a map poll, that means we don't yet
// have a DERPMap or aren't in a state where we even want
t.setHealthyLocked(noDERPHomeWarnable)
t.setHealthyLocked(noDERPConnectionWarnable)
t.setHealthyLocked(derpTimeoutWarnable)
@@ -1165,7 +1190,7 @@ func (t *Tracker) updateBuiltinWarnablesLocked() {
// updateWarmingUpWarnableLocked ensures the warmingUpWarnable is healthy if wantRunning has been set to true
// for more than warmingUpWarnableDuration.
func (t *Tracker) updateWarmingUpWarnableLocked() {
if !t.ipnWantRunningLastTrue.IsZero() && time.Now().After(t.ipnWantRunningLastTrue.Add(warmingUpWarnableDuration)) {
if !t.ipnWantRunningLastTrue.IsZero() && t.now().After(t.ipnWantRunningLastTrue.Add(warmingUpWarnableDuration)) {
t.setHealthyLocked(warmingUpWarnable)
}
}
@@ -1277,7 +1302,7 @@ func (t *Tracker) LastNoiseDialWasRecent() bool {
t.mu.Lock()
defer t.mu.Unlock()
now := time.Now()
now := t.now()
dur := now.Sub(t.lastNoiseDial)
t.lastNoiseDial = now
return dur < 2*time.Minute

View File

@@ -12,6 +12,7 @@ import (
"time"
"tailscale.com/tailcfg"
"tailscale.com/tstest"
"tailscale.com/types/opt"
"tailscale.com/util/usermetric"
"tailscale.com/version"
@@ -257,9 +258,15 @@ func TestCheckDependsOnAppearsInUnhealthyState(t *testing.T) {
}
ht.SetUnhealthy(w2, Args{ArgError: "w2 is also unhealthy now"})
us2, ok := ht.CurrentState().Warnings[w2.Code]
if !ok {
t.Fatalf("Expected an UnhealthyState for w2, got nothing")
if ok {
t.Fatalf("Saw w2 being unhealthy but it shouldn't be, as it depends on unhealthy w1")
}
ht.SetHealthy(w1)
us2, ok = ht.CurrentState().Warnings[w2.Code]
if !ok {
t.Fatalf("w2 wasn't unhealthy; want it to be unhealthy now that w1 is back healthy")
}
wantDependsOn = slices.Concat([]WarnableCode{w1.Code}, wantDependsOn)
if !reflect.DeepEqual(us2.DependsOn, wantDependsOn) {
t.Fatalf("Expected DependsOn = %v in the unhealthy state, got: %v", wantDependsOn, us2.DependsOn)
@@ -400,3 +407,47 @@ func TestHealthMetric(t *testing.T) {
})
}
}
// TestNoDERPHomeWarnable checks that we don't
// complain about no DERP home if we're not in a
// map poll.
func TestNoDERPHomeWarnable(t *testing.T) {
t.Skip("TODO: fix https://github.com/tailscale/tailscale/issues/14798 to make this test not deadlock")
clock := tstest.NewClock(tstest.ClockOpts{
Start: time.Unix(123, 0),
FollowRealTime: false,
})
ht := &Tracker{
testClock: clock,
}
ht.SetIPNState("NeedsLogin", true)
// Advance 30 seconds to get past the "recentlyLoggedIn" check.
clock.Advance(30 * time.Second)
ht.updateBuiltinWarnablesLocked()
// Advance to get past the the TimeToVisible delay.
clock.Advance(noDERPHomeWarnable.TimeToVisible * 2)
ht.updateBuiltinWarnablesLocked()
if ws, ok := ht.CurrentState().Warnings[noDERPHomeWarnable.Code]; ok {
t.Fatalf("got unexpected noDERPHomeWarnable warnable: %v", ws)
}
}
// TestNoDERPHomeWarnableManual is like TestNoDERPHomeWarnable
// but doesn't use tstest.Clock so avoids the deadlock
// I hit: https://github.com/tailscale/tailscale/issues/14798
func TestNoDERPHomeWarnableManual(t *testing.T) {
ht := &Tracker{}
ht.SetIPNState("NeedsLogin", true)
// Avoid wantRunning:
ht.ipnWantRunningLastTrue = ht.ipnWantRunningLastTrue.Add(-10 * time.Second)
ht.updateBuiltinWarnablesLocked()
ws, ok := ht.warnableVal[noDERPHomeWarnable]
if ok {
t.Fatalf("got unexpected noDERPHomeWarnable warnable: %v", ws)
}
}

View File

@@ -86,10 +86,15 @@ func (t *Tracker) CurrentState() *State {
wm := map[WarnableCode]UnhealthyState{}
for w, ws := range t.warnableVal {
if !w.IsVisible(ws) {
if !w.IsVisible(ws, t.now) {
// Skip invisible Warnables.
continue
}
if t.isEffectivelyHealthyLocked(w) {
// Skip Warnables that are unhealthy if they have dependencies
// that are unhealthy.
continue
}
wm[w.Code] = *w.unhealthyState(ws)
}
@@ -97,3 +102,23 @@ func (t *Tracker) CurrentState() *State {
Warnings: wm,
}
}
// isEffectivelyHealthyLocked reports whether w is effectively healthy.
// That means it's either actually healthy or it has a dependency that
// that's unhealthy, so we should treat w as healthy to not spam users
// with multiple warnings when only the root cause is relevant.
func (t *Tracker) isEffectivelyHealthyLocked(w *Warnable) bool {
if _, ok := t.warnableVal[w]; !ok {
// Warnable not found in the tracker. So healthy.
return true
}
for _, d := range w.DependsOn {
if !t.isEffectivelyHealthyLocked(d) {
// If one of our deps is unhealthy, we're healthy.
return true
}
}
// If we have no unhealthy deps and had warnableVal set,
// we're unhealthy.
return false
}

View File

@@ -32,7 +32,6 @@ import (
"sync"
"time"
"github.com/tailscale/golang-x-crypto/acme"
"tailscale.com/atomicfile"
"tailscale.com/envknob"
"tailscale.com/hostinfo"
@@ -41,6 +40,7 @@ import (
"tailscale.com/ipn/store"
"tailscale.com/ipn/store/mem"
"tailscale.com/net/bakedroots"
"tailscale.com/tempfork/acme"
"tailscale.com/types/logger"
"tailscale.com/util/testenv"
"tailscale.com/version"
@@ -556,6 +556,7 @@ func (b *LocalBackend) getCertPEM(ctx context.Context, cs certStore, logf logger
}
logf("requesting cert...")
traceACME(csr)
der, _, err := ac.CreateOrderCert(ctx, order.FinalizeURL, csr, true)
if err != nil {
return nil, fmt.Errorf("CreateOrder: %v", err)
@@ -578,10 +579,10 @@ func (b *LocalBackend) getCertPEM(ctx context.Context, cs certStore, logf logger
}
// certRequest generates a CSR for the given common name cn and optional SANs.
func certRequest(key crypto.Signer, cn string, ext []pkix.Extension, san ...string) ([]byte, error) {
func certRequest(key crypto.Signer, name string, ext []pkix.Extension) ([]byte, error) {
req := &x509.CertificateRequest{
Subject: pkix.Name{CommonName: cn},
DNSNames: san,
Subject: pkix.Name{CommonName: name},
DNSNames: []string{name},
ExtraExtensions: ext,
}
return x509.CreateCertificateRequest(rand.Reader, req, key)
@@ -658,8 +659,9 @@ func acmeClient(cs certStore) (*acme.Client, error) {
// LetsEncrypt), we should make sure that they support ARI extension (see
// shouldStartDomainRenewalARI).
return &acme.Client{
Key: key,
UserAgent: "tailscaled/" + version.Long(),
Key: key,
UserAgent: "tailscaled/" + version.Long(),
DirectoryURL: envknob.String("TS_DEBUG_ACME_DIRECTORY_URL"),
}, nil
}

View File

@@ -199,3 +199,19 @@ func TestShouldStartDomainRenewal(t *testing.T) {
})
}
}
func TestDebugACMEDirectoryURL(t *testing.T) {
for _, tc := range []string{"", "https://acme-staging-v02.api.letsencrypt.org/directory"} {
const setting = "TS_DEBUG_ACME_DIRECTORY_URL"
t.Run(tc, func(t *testing.T) {
t.Setenv(setting, tc)
ac, err := acmeClient(certStateStore{StateStore: new(mem.Store)})
if err != nil {
t.Fatalf("acmeClient creation err: %v", err)
}
if ac.DirectoryURL != tc {
t.Fatalf("acmeClient.DirectoryURL = %q, want %q", ac.DirectoryURL, tc)
}
})
}
}

View File

@@ -73,6 +73,7 @@ import (
"tailscale.com/net/netmon"
"tailscale.com/net/netns"
"tailscale.com/net/netutil"
"tailscale.com/net/packet"
"tailscale.com/net/tsaddr"
"tailscale.com/net/tsdial"
"tailscale.com/paths"
@@ -115,7 +116,6 @@ import (
"tailscale.com/version"
"tailscale.com/version/distro"
"tailscale.com/wgengine"
"tailscale.com/wgengine/capture"
"tailscale.com/wgengine/filter"
"tailscale.com/wgengine/magicsock"
"tailscale.com/wgengine/router"
@@ -209,7 +209,7 @@ type LocalBackend struct {
// Tailscale on port 5252.
exposeRemoteWebClientAtomicBool atomic.Bool
shutdownCalled bool // if Shutdown has been called
debugSink *capture.Sink
debugSink packet.CaptureSink
sockstatLogger *sockstatlog.Logger
// getTCPHandlerForFunnelFlow returns a handler for an incoming TCP flow for
@@ -948,6 +948,40 @@ func (b *LocalBackend) onHealthChange(w *health.Warnable, us *health.UnhealthySt
}
}
// GetOrSetCaptureSink returns the current packet capture sink, creating it
// with the provided newSink function if it does not already exist.
func (b *LocalBackend) GetOrSetCaptureSink(newSink func() packet.CaptureSink) packet.CaptureSink {
b.mu.Lock()
defer b.mu.Unlock()
if b.debugSink != nil {
return b.debugSink
}
s := newSink()
b.debugSink = s
b.e.InstallCaptureHook(s.CaptureCallback())
return s
}
func (b *LocalBackend) ClearCaptureSink() {
// Shut down & uninstall the sink if there are no longer
// any outputs on it.
b.mu.Lock()
defer b.mu.Unlock()
select {
case <-b.ctx.Done():
return
default:
}
if b.debugSink != nil && b.debugSink.NumOutputs() == 0 {
s := b.debugSink
b.e.InstallCaptureHook(nil)
b.debugSink = nil
s.Close()
}
}
// Shutdown halts the backend and all its sub-components. The backend
// can no longer be used after Shutdown returns.
func (b *LocalBackend) Shutdown() {
@@ -1048,7 +1082,6 @@ func stripKeysFromPrefs(p ipn.PrefsView) ipn.PrefsView {
}
p2 := p.AsStruct()
p2.Persist.LegacyFrontendPrivateMachineKey = key.MachinePrivate{}
p2.Persist.PrivateNodeKey = key.NodePrivate{}
p2.Persist.OldPrivateNodeKey = key.NodePrivate{}
p2.Persist.NetworkLockKey = key.NLPrivate{}
@@ -3309,11 +3342,6 @@ func (b *LocalBackend) initMachineKeyLocked() (err error) {
return nil
}
var legacyMachineKey key.MachinePrivate
if p := b.pm.CurrentPrefs().Persist(); p.Valid() {
legacyMachineKey = p.LegacyFrontendPrivateMachineKey()
}
keyText, err := b.store.ReadState(ipn.MachineKeyStateKey)
if err == nil {
if err := b.machinePrivKey.UnmarshalText(keyText); err != nil {
@@ -3322,9 +3350,6 @@ func (b *LocalBackend) initMachineKeyLocked() (err error) {
if b.machinePrivKey.IsZero() {
return fmt.Errorf("invalid zero key stored in %v key of %v", ipn.MachineKeyStateKey, b.store)
}
if !legacyMachineKey.IsZero() && !legacyMachineKey.Equal(b.machinePrivKey) {
b.logf("frontend-provided legacy machine key ignored; used value from server state")
}
return nil
}
if err != ipn.ErrStateNotExist {
@@ -3334,12 +3359,8 @@ func (b *LocalBackend) initMachineKeyLocked() (err error) {
// If we didn't find one already on disk and the prefs already
// have a legacy machine key, use that. Otherwise generate a
// new one.
if !legacyMachineKey.IsZero() {
b.machinePrivKey = legacyMachineKey
} else {
b.logf("generating new machine key")
b.machinePrivKey = key.NewMachine()
}
b.logf("generating new machine key")
b.machinePrivKey = key.NewMachine()
keyText, _ = b.machinePrivKey.MarshalText()
if err := ipn.WriteState(b.store, ipn.MachineKeyStateKey, keyText); err != nil {
@@ -7154,48 +7175,6 @@ func (b *LocalBackend) ResetAuth() error {
return b.resetForProfileChangeLockedOnEntry(unlock)
}
// StreamDebugCapture writes a pcap stream of packets traversing
// tailscaled to the provided response writer.
func (b *LocalBackend) StreamDebugCapture(ctx context.Context, w io.Writer) error {
var s *capture.Sink
b.mu.Lock()
if b.debugSink == nil {
s = capture.New()
b.debugSink = s
b.e.InstallCaptureHook(s.LogPacket)
} else {
s = b.debugSink
}
b.mu.Unlock()
unregister := s.RegisterOutput(w)
select {
case <-ctx.Done():
case <-s.WaitCh():
}
unregister()
// Shut down & uninstall the sink if there are no longer
// any outputs on it.
b.mu.Lock()
defer b.mu.Unlock()
select {
case <-b.ctx.Done():
return nil
default:
}
if b.debugSink != nil && b.debugSink.NumOutputs() == 0 {
s := b.debugSink
b.e.InstallCaptureHook(nil)
b.debugSink = nil
return s.Close()
}
return nil
}
func (b *LocalBackend) GetPeerEndpointChanges(ctx context.Context, ip netip.Addr) ([]magicsock.EndpointChange, error) {
pip, ok := b.e.PeerForIP(ip)
if !ok {

View File

@@ -949,8 +949,6 @@ func TestEditPrefsHasNoKeys(t *testing.T) {
Persist: &persist.Persist{
PrivateNodeKey: key.NewNode(),
OldPrivateNodeKey: key.NewNode(),
LegacyFrontendPrivateMachineKey: key.NewMachine(),
},
}).View(), ipn.NetworkProfile{})
if p := b.pm.CurrentPrefs().Persist(); !p.Valid() || p.PrivateNodeKey().IsZero() {
@@ -977,10 +975,6 @@ func TestEditPrefsHasNoKeys(t *testing.T) {
t.Errorf("OldPrivateNodeKey = %v; want zero", p.Persist().OldPrivateNodeKey())
}
if !p.Persist().LegacyFrontendPrivateMachineKey().IsZero() {
t.Errorf("LegacyFrontendPrivateMachineKey = %v; want zero", p.Persist().LegacyFrontendPrivateMachineKey())
}
if !p.Persist().NetworkLockKey().IsZero() {
t.Errorf("NetworkLockKey= %v; want zero", p.Persist().NetworkLockKey())
}

View File

@@ -68,12 +68,12 @@ import (
"tailscale.com/wgengine/magicsock"
)
type localAPIHandler func(*Handler, http.ResponseWriter, *http.Request)
type LocalAPIHandler func(*Handler, http.ResponseWriter, *http.Request)
// handler is the set of LocalAPI handlers, keyed by the part of the
// Request.URL.Path after "/localapi/v0/". If the key ends with a trailing slash
// then it's a prefix match.
var handler = map[string]localAPIHandler{
var handler = map[string]LocalAPIHandler{
// The prefix match handlers end with a slash:
"cert/": (*Handler).serveCert,
"file-put/": (*Handler).serveFilePut,
@@ -90,7 +90,6 @@ var handler = map[string]localAPIHandler{
"check-udp-gro-forwarding": (*Handler).serveCheckUDPGROForwarding,
"component-debug-logging": (*Handler).serveComponentDebugLogging,
"debug": (*Handler).serveDebug,
"debug-capture": (*Handler).serveDebugCapture,
"debug-derp-region": (*Handler).serveDebugDERPRegion,
"debug-dial-types": (*Handler).serveDebugDialTypes,
"debug-log": (*Handler).serveDebugLog,
@@ -152,6 +151,14 @@ var handler = map[string]localAPIHandler{
"whois": (*Handler).serveWhoIs,
}
// Register registers a new LocalAPI handler for the given name.
func Register(name string, fn LocalAPIHandler) {
if _, ok := handler[name]; ok {
panic("duplicate LocalAPI handler registration: " + name)
}
handler[name] = fn
}
var (
// The clientmetrics package is stateful, but we want to expose a simple
// imperative API to local clients, so we need to keep track of
@@ -196,6 +203,10 @@ type Handler struct {
clock tstime.Clock
}
func (h *Handler) LocalBackend() *ipnlocal.LocalBackend {
return h.b
}
func (h *Handler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
if h.b == nil {
http.Error(w, "server has no local backend", http.StatusInternalServerError)
@@ -260,7 +271,7 @@ func (h *Handler) validHost(hostname string) bool {
// handlerForPath returns the LocalAPI handler for the provided Request.URI.Path.
// (the path doesn't include any query parameters)
func handlerForPath(urlPath string) (h localAPIHandler, ok bool) {
func handlerForPath(urlPath string) (h LocalAPIHandler, ok bool) {
if urlPath == "/" {
return (*Handler).serveLocalAPIRoot, true
}
@@ -2689,21 +2700,6 @@ func defBool(a string, def bool) bool {
return v
}
func (h *Handler) serveDebugCapture(w http.ResponseWriter, r *http.Request) {
if !h.PermitWrite {
http.Error(w, "debug access denied", http.StatusForbidden)
return
}
if r.Method != "POST" {
http.Error(w, "POST required", http.StatusMethodNotAllowed)
return
}
w.WriteHeader(http.StatusOK)
w.(http.Flusher).Flush()
h.b.StreamDebugCapture(r.Context(), w)
}
func (h *Handler) serveDebugLog(w http.ResponseWriter, r *http.Request) {
if !h.PermitRead {
http.Error(w, "debug-log access denied", http.StatusForbidden)

View File

@@ -467,13 +467,6 @@ func TestPrefsPretty(t *testing.T) {
"darwin",
`Prefs{ra=false dns=false want=true tags=tag:foo,tag:bar url="http://localhost:1234" update=off Persist=nil}`,
},
{
Prefs{
Persist: &persist.Persist{},
},
"linux",
`Prefs{ra=false dns=false want=false routes=[] nf=off update=off Persist{lm=, o=, n= u=""}}`,
},
{
Prefs{
Persist: &persist.Persist{
@@ -481,7 +474,7 @@ func TestPrefsPretty(t *testing.T) {
},
},
"linux",
`Prefs{ra=false dns=false want=false routes=[] nf=off update=off Persist{lm=, o=, n=[B1VKl] u=""}}`,
`Prefs{ra=false dns=false want=false routes=[] nf=off update=off Persist{o=, n=[B1VKl] u=""}}`,
},
{
Prefs{

View File

@@ -599,7 +599,7 @@ _Appears in:_
| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `type` _[ProxyGroupType](#proxygrouptype)_ | Type of the ProxyGroup proxies. Supported types are egress and ingress.<br />Type is immutable once a ProxyGroup is created. | | Enum: [egress ingress] <br />Type: string <br /> |
| `type` _[ProxyGroupType](#proxygrouptype)_ | Type of the ProxyGroup proxies. Currently the only supported type is egress.<br />Type is immutable once a ProxyGroup is created. | | Enum: [egress ingress] <br />Type: string <br /> |
| `tags` _[Tags](#tags)_ | Tags that the Tailscale devices will be tagged with. Defaults to [tag:k8s].<br />If you specify custom tags here, make sure you also make the operator<br />an owner of these tags.<br />See https://tailscale.com/kb/1236/kubernetes-operator/#setting-up-the-kubernetes-operator.<br />Tags cannot be changed once a ProxyGroup device has been created.<br />Tag values must be in form ^tag:[a-zA-Z][a-zA-Z0-9-]*$. | | Pattern: `^tag:[a-zA-Z][a-zA-Z0-9-]*$` <br />Type: string <br /> |
| `replicas` _integer_ | Replicas specifies how many replicas to create the StatefulSet with.<br />Defaults to 2. | | Minimum: 0 <br /> |
| `hostnamePrefix` _[HostnamePrefix](#hostnameprefix)_ | HostnamePrefix is the hostname prefix to use for tailnet devices created<br />by the ProxyGroup. Each device will have the integer number from its<br />StatefulSet pod appended to this prefix to form the full hostname.<br />HostnamePrefix can contain lower case letters, numbers and dashes, it<br />must not start with a dash and must be between 1 and 62 characters long. | | Pattern: `^[a-z0-9][a-z0-9-]{0,61}$` <br />Type: string <br /> |

View File

@@ -48,7 +48,7 @@ type ProxyGroupList struct {
}
type ProxyGroupSpec struct {
// Type of the ProxyGroup proxies. Supported types are egress and ingress.
// Type of the ProxyGroup proxies. Currently the only supported type is egress.
// Type is immutable once a ProxyGroup is created.
// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="ProxyGroup type is immutable"
Type ProxyGroupType `json:"type"`

View File

@@ -75,16 +75,6 @@ func RemoveServiceCondition(svc *corev1.Service, conditionType tsapi.ConditionTy
})
}
func EgressServiceIsValidAndConfigured(svc *corev1.Service) bool {
for _, typ := range []tsapi.ConditionType{tsapi.EgressSvcValid, tsapi.EgressSvcConfigured} {
cond := GetServiceCondition(svc, typ)
if cond == nil || cond.Status != metav1.ConditionTrue {
return false
}
}
return true
}
// SetRecorderCondition ensures that Recorder status has a condition with the
// given attributes. LastTransitionTime gets set every time condition's status
// changes.

View File

@@ -13,9 +13,15 @@ import (
"net/netip"
)
// KeyEgressServices is name of the proxy state Secret field that contains the
// currently applied egress proxy config.
const KeyEgressServices = "egress-services"
const (
// KeyEgressServices is name of the proxy state Secret field that contains the
// currently applied egress proxy config.
KeyEgressServices = "egress-services"
// KeyHEPPings is the number of times an egress service health check endpoint needs to be pinged to ensure that
// each currently configured backend is hit. In practice, it depends on the number of ProxyGroup replicas.
KeyHEPPings = "hep-pings"
)
// Configs contains the desired configuration for egress services keyed by
// service name.
@@ -24,6 +30,7 @@ type Configs map[string]Config
// Config is an egress service configuration.
// TODO(irbekrm): version this?
type Config struct {
HealthCheckEndpoint string `json:"healthCheckEndpoint"`
// TailnetTarget is the target to which cluster traffic for this service
// should be proxied.
TailnetTarget TailnetTarget `json:"tailnetTarget"`

View File

@@ -55,7 +55,7 @@ func Test_jsonMarshalConfig(t *testing.T) {
protocol: "tcp",
matchPort: 4003,
targetPort: 80,
wantsBs: []byte(`{"tailnetTarget":{"ip":"","fqdn":""},"ports":[{"protocol":"tcp","matchPort":4003,"targetPort":80}]}`),
wantsBs: []byte(`{"healthCheckEndpoint":"","tailnetTarget":{"ip":"","fqdn":""},"ports":[{"protocol":"tcp","matchPort":4003,"targetPort":80}]}`),
},
}
for _, tt := range tests {

View File

@@ -43,4 +43,9 @@ const (
// that cluster workloads behind the Ingress can now be accessed via the given DNS name over HTTPS.
KeyHTTPSEndpoint string = "https_endpoint"
ValueNoHTTPS string = "no-https"
// Pod's IPv4 address header key as returned by containerboot health check endpoint.
PodIPv4Header string = "Pod-IPv4"
EgessServicesPreshutdownEP = "/internal-egress-services-preshutdown"
)

View File

@@ -56,7 +56,19 @@ func (m *darwinRouteMon) Receive() (message, error) {
if err != nil {
return nil, err
}
msgs, err := route.ParseRIB(route.RIBTypeRoute, m.buf[:n])
msgs, err := func() (msgs []route.Message, err error) {
defer func() {
// #14201: permanent panic protection, as we have been burned by
// ParseRIB panics too many times.
msg := recover()
if msg != nil {
msgs = nil
m.logf("[unexpected] netmon: panic in route.ParseRIB from % 02x", m.buf[:n])
err = fmt.Errorf("panic in route.ParseRIB: %s", msg)
}
}()
return route.ParseRIB(route.RIBTypeRoute, m.buf[:n])
}()
if err != nil {
if debugRouteMessages {
m.logf("read %d bytes (% 02x), failed to parse RIB: %v", n, m.buf[:n], err)

75
net/packet/capture.go Normal file
View File

@@ -0,0 +1,75 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
package packet
import (
"io"
"net/netip"
"time"
)
// Callback describes a function which is called to
// record packets when debugging packet-capture.
// Such callbacks must not take ownership of the
// provided data slice: it may only copy out of it
// within the lifetime of the function.
type CaptureCallback func(CapturePath, time.Time, []byte, CaptureMeta)
// CaptureSink is the minimal interface from [tailscale.com/feature/capture]'s
// Sink type that is needed by the core (magicsock/LocalBackend/wgengine/etc).
// This lets the relativel heavy feature/capture package be optionally linked.
type CaptureSink interface {
// Close closes
Close() error
// NumOutputs returns the number of outputs registered with the sink.
NumOutputs() int
// CaptureCallback returns a callback which can be used to
// write packets to the sink.
CaptureCallback() CaptureCallback
// WaitCh returns a channel which blocks until
// the sink is closed.
WaitCh() <-chan struct{}
// RegisterOutput connects an output to this sink, which
// will be written to with a pcap stream as packets are logged.
// A function is returned which unregisters the output when
// called.
//
// If w implements io.Closer, it will be closed upon error
// or when the sink is closed. If w implements http.Flusher,
// it will be flushed periodically.
RegisterOutput(w io.Writer) (unregister func())
}
// CaptureMeta contains metadata that is used when debugging.
type CaptureMeta struct {
DidSNAT bool // SNAT was performed & the address was updated.
OriginalSrc netip.AddrPort // The source address before SNAT was performed.
DidDNAT bool // DNAT was performed & the address was updated.
OriginalDst netip.AddrPort // The destination address before DNAT was performed.
}
// CapturePath describes where in the data path the packet was captured.
type CapturePath uint8
// CapturePath values
const (
// FromLocal indicates the packet was logged as it traversed the FromLocal path:
// i.e.: A packet from the local system into the TUN.
FromLocal CapturePath = 0
// FromPeer indicates the packet was logged upon reception from a remote peer.
FromPeer CapturePath = 1
// SynthesizedToLocal indicates the packet was generated from within tailscaled,
// and is being routed to the local machine's network stack.
SynthesizedToLocal CapturePath = 2
// SynthesizedToPeer indicates the packet was generated from within tailscaled,
// and is being routed to a remote Wireguard peer.
SynthesizedToPeer CapturePath = 3
// PathDisco indicates the packet is information about a disco frame.
PathDisco CapturePath = 254
)

View File

@@ -34,14 +34,6 @@ const (
TCPECNBits TCPFlag = TCPECNEcho | TCPCWR
)
// CaptureMeta contains metadata that is used when debugging.
type CaptureMeta struct {
DidSNAT bool // SNAT was performed & the address was updated.
OriginalSrc netip.AddrPort // The source address before SNAT was performed.
DidDNAT bool // DNAT was performed & the address was updated.
OriginalDst netip.AddrPort // The destination address before DNAT was performed.
}
// Parsed is a minimal decoding of a packet suitable for use in filters.
type Parsed struct {
// b is the byte buffer that this decodes.

View File

@@ -36,7 +36,6 @@ import (
"tailscale.com/types/logger"
"tailscale.com/util/clientmetric"
"tailscale.com/util/usermetric"
"tailscale.com/wgengine/capture"
"tailscale.com/wgengine/filter"
"tailscale.com/wgengine/netstack/gro"
"tailscale.com/wgengine/wgcfg"
@@ -208,7 +207,7 @@ type Wrapper struct {
// stats maintains per-connection counters.
stats atomic.Pointer[connstats.Statistics]
captureHook syncs.AtomicValue[capture.Callback]
captureHook syncs.AtomicValue[packet.CaptureCallback]
metrics *metrics
}
@@ -955,7 +954,7 @@ func (t *Wrapper) Read(buffs [][]byte, sizes []int, offset int) (int, error) {
}
}
if captHook != nil {
captHook(capture.FromLocal, t.now(), p.Buffer(), p.CaptureMeta)
captHook(packet.FromLocal, t.now(), p.Buffer(), p.CaptureMeta)
}
if !t.disableFilter {
var response filter.Response
@@ -1101,9 +1100,9 @@ func (t *Wrapper) injectedRead(res tunInjectedRead, outBuffs [][]byte, sizes []i
return n, err
}
func (t *Wrapper) filterPacketInboundFromWireGuard(p *packet.Parsed, captHook capture.Callback, pc *peerConfigTable, gro *gro.GRO) (filter.Response, *gro.GRO) {
func (t *Wrapper) filterPacketInboundFromWireGuard(p *packet.Parsed, captHook packet.CaptureCallback, pc *peerConfigTable, gro *gro.GRO) (filter.Response, *gro.GRO) {
if captHook != nil {
captHook(capture.FromPeer, t.now(), p.Buffer(), p.CaptureMeta)
captHook(packet.FromPeer, t.now(), p.Buffer(), p.CaptureMeta)
}
if p.IPProto == ipproto.TSMP {
@@ -1317,7 +1316,7 @@ func (t *Wrapper) InjectInboundPacketBuffer(pkt *stack.PacketBuffer, buffs [][]b
p.Decode(buf)
captHook := t.captureHook.Load()
if captHook != nil {
captHook(capture.SynthesizedToLocal, t.now(), p.Buffer(), p.CaptureMeta)
captHook(packet.SynthesizedToLocal, t.now(), p.Buffer(), p.CaptureMeta)
}
invertGSOChecksum(buf, pkt.GSOOptions)
@@ -1449,7 +1448,7 @@ func (t *Wrapper) InjectOutboundPacketBuffer(pkt *stack.PacketBuffer) error {
}
if capt := t.captureHook.Load(); capt != nil {
b := pkt.ToBuffer()
capt(capture.SynthesizedToPeer, t.now(), b.Flatten(), packet.CaptureMeta{})
capt(packet.SynthesizedToPeer, t.now(), b.Flatten(), packet.CaptureMeta{})
}
t.injectOutbound(tunInjectedRead{packet: pkt})
@@ -1491,6 +1490,6 @@ var (
metricPacketOutDropSelfDisco = clientmetric.NewCounter("tstun_out_to_wg_drop_self_disco")
)
func (t *Wrapper) InstallCaptureHook(cb capture.Callback) {
func (t *Wrapper) InstallCaptureHook(cb packet.CaptureCallback) {
t.captureHook.Store(cb)
}

View File

@@ -40,7 +40,6 @@ import (
"tailscale.com/types/views"
"tailscale.com/util/must"
"tailscale.com/util/usermetric"
"tailscale.com/wgengine/capture"
"tailscale.com/wgengine/filter"
"tailscale.com/wgengine/wgcfg"
)
@@ -871,14 +870,14 @@ func TestPeerCfg_NAT(t *testing.T) {
// with the correct parameters when various packet operations are performed.
func TestCaptureHook(t *testing.T) {
type captureRecord struct {
path capture.Path
path packet.CapturePath
now time.Time
pkt []byte
meta packet.CaptureMeta
}
var captured []captureRecord
hook := func(path capture.Path, now time.Time, pkt []byte, meta packet.CaptureMeta) {
hook := func(path packet.CapturePath, now time.Time, pkt []byte, meta packet.CaptureMeta) {
captured = append(captured, captureRecord{
path: path,
now: now,
@@ -935,19 +934,19 @@ func TestCaptureHook(t *testing.T) {
// Assert that the right packets are captured.
want := []captureRecord{
{
path: capture.FromPeer,
path: packet.FromPeer,
pkt: []byte("Write1"),
},
{
path: capture.FromPeer,
path: packet.FromPeer,
pkt: []byte("Write2"),
},
{
path: capture.SynthesizedToLocal,
path: packet.SynthesizedToLocal,
pkt: []byte("InjectInboundPacketBuffer"),
},
{
path: capture.SynthesizedToPeer,
path: packet.SynthesizedToPeer,
pkt: []byte("InjectOutboundPacketBuffer"),
},
}

View File

@@ -7,6 +7,7 @@
package prober
import (
"cmp"
"container/ring"
"context"
"encoding/json"
@@ -20,6 +21,7 @@ import (
"time"
"github.com/prometheus/client_golang/prometheus"
"tailscale.com/syncs"
"tailscale.com/tsweb"
)
@@ -44,6 +46,14 @@ type ProbeClass struct {
// exposed by this probe class.
Labels Labels
// Timeout is the maximum time the probe function is allowed to run before
// its context is cancelled. Defaults to 80% of the scheduling interval.
Timeout time.Duration
// Concurrency is the maximum number of concurrent probe executions
// allowed for this probe class. Defaults to 1.
Concurrency int
// Metrics allows a probe class to export custom Metrics. Can be nil.
Metrics func(prometheus.Labels) []prometheus.Metric
}
@@ -131,9 +141,12 @@ func newProbe(p *Prober, name string, interval time.Duration, l prometheus.Label
cancel: cancel,
stopped: make(chan struct{}),
runSema: syncs.NewSemaphore(cmp.Or(pc.Concurrency, 1)),
name: name,
probeClass: pc,
interval: interval,
timeout: cmp.Or(pc.Timeout, time.Duration(float64(interval)*0.8)),
initialDelay: initialDelay(name, interval),
successHist: ring.New(recentHistSize),
latencyHist: ring.New(recentHistSize),
@@ -226,11 +239,12 @@ type Probe struct {
ctx context.Context
cancel context.CancelFunc // run to initiate shutdown
stopped chan struct{} // closed when shutdown is complete
runMu sync.Mutex // ensures only one probe runs at a time
runSema syncs.Semaphore // restricts concurrency per probe
name string
probeClass ProbeClass
interval time.Duration
timeout time.Duration
initialDelay time.Duration
tick ticker
@@ -282,17 +296,15 @@ func (p *Probe) loop() {
t := p.prober.newTicker(p.initialDelay)
select {
case <-t.Chan():
p.run()
case <-p.ctx.Done():
t.Stop()
return
}
t.Stop()
} else {
p.run()
}
if p.prober.once {
p.run()
return
}
@@ -315,9 +327,12 @@ func (p *Probe) loop() {
p.tick = p.prober.newTicker(p.interval)
defer p.tick.Stop()
for {
// Run the probe in a new goroutine every tick. Default concurrency & timeout
// settings will ensure that only one probe is running at a time.
go p.run()
select {
case <-p.tick.Chan():
p.run()
case <-p.ctx.Done():
return
}
@@ -331,8 +346,13 @@ func (p *Probe) loop() {
// that the probe either succeeds or fails before the next cycle is scheduled to
// start.
func (p *Probe) run() (pi ProbeInfo, err error) {
p.runMu.Lock()
defer p.runMu.Unlock()
// Probes are scheduled each p.interval, so we don't wait longer than that.
semaCtx, cancel := context.WithTimeout(p.ctx, p.interval)
defer cancel()
if !p.runSema.AcquireContext(semaCtx) {
return pi, fmt.Errorf("probe %s: context cancelled", p.name)
}
defer p.runSema.Release()
p.recordStart()
defer func() {
@@ -344,19 +364,21 @@ func (p *Probe) run() (pi ProbeInfo, err error) {
if r := recover(); r != nil {
log.Printf("probe %s panicked: %v", p.name, r)
err = fmt.Errorf("panic: %v", r)
p.recordEnd(err)
p.recordEndLocked(err)
}
}()
ctx := p.ctx
if !p.IsContinuous() {
timeout := time.Duration(float64(p.interval) * 0.8)
var cancel func()
ctx, cancel = context.WithTimeout(ctx, timeout)
ctx, cancel = context.WithTimeout(ctx, p.timeout)
defer cancel()
}
err = p.probeClass.Probe(ctx)
p.recordEnd(err)
p.mu.Lock()
defer p.mu.Unlock()
p.recordEndLocked(err)
if err != nil {
log.Printf("probe %s: %v", p.name, err)
}
@@ -370,10 +392,8 @@ func (p *Probe) recordStart() {
p.mu.Unlock()
}
func (p *Probe) recordEnd(err error) {
func (p *Probe) recordEndLocked(err error) {
end := p.prober.now()
p.mu.Lock()
defer p.mu.Unlock()
p.end = end
p.succeeded = err == nil
p.lastErr = err

View File

@@ -149,6 +149,74 @@ func TestProberTimingSpread(t *testing.T) {
notCalled()
}
func TestProberTimeout(t *testing.T) {
clk := newFakeTime()
p := newForTest(clk.Now, clk.NewTicker)
var done sync.WaitGroup
done.Add(1)
pfunc := FuncProbe(func(ctx context.Context) error {
defer done.Done()
select {
case <-ctx.Done():
return ctx.Err()
}
})
pfunc.Timeout = time.Microsecond
probe := p.Run("foo", 30*time.Second, nil, pfunc)
waitActiveProbes(t, p, clk, 1)
done.Wait()
probe.mu.Lock()
info := probe.probeInfoLocked()
probe.mu.Unlock()
wantInfo := ProbeInfo{
Name: "foo",
Interval: 30 * time.Second,
Labels: map[string]string{"class": "", "name": "foo"},
Status: ProbeStatusFailed,
Error: "context deadline exceeded",
RecentResults: []bool{false},
RecentLatencies: nil,
}
if diff := cmp.Diff(wantInfo, info, cmpopts.IgnoreFields(ProbeInfo{}, "Start", "End", "Latency")); diff != "" {
t.Fatalf("unexpected ProbeInfo (-want +got):\n%s", diff)
}
if got := info.Latency; got > time.Second {
t.Errorf("info.Latency = %v, want at most 1s", got)
}
}
func TestProberConcurrency(t *testing.T) {
clk := newFakeTime()
p := newForTest(clk.Now, clk.NewTicker)
var ran atomic.Int64
stopProbe := make(chan struct{})
pfunc := FuncProbe(func(ctx context.Context) error {
ran.Add(1)
<-stopProbe
return nil
})
pfunc.Timeout = time.Hour
pfunc.Concurrency = 3
p.Run("foo", time.Second, nil, pfunc)
waitActiveProbes(t, p, clk, 1)
for range 50 {
clk.Advance(time.Second)
}
if err := tstest.WaitFor(convergenceTimeout, func() error {
if got, want := ran.Load(), int64(3); got != want {
return fmt.Errorf("expected %d probes to run concurrently, got %d", want, got)
}
return nil
}); err != nil {
t.Fatal(err)
}
close(stopProbe)
}
func TestProberRun(t *testing.T) {
clk := newFakeTime()
p := newForTest(clk.Now, clk.NewTicker)
@@ -450,9 +518,11 @@ func TestProbeInfoRecent(t *testing.T) {
for _, r := range tt.results {
probe.recordStart()
clk.Advance(r.latency)
probe.recordEnd(r.err)
probe.recordEndLocked(r.err)
}
probe.mu.Lock()
info := probe.probeInfoLocked()
probe.mu.Unlock()
if diff := cmp.Diff(tt.wantProbeInfo, info, cmpopts.IgnoreFields(ProbeInfo{}, "Start", "End", "Interval")); diff != "" {
t.Fatalf("unexpected ProbeInfo (-want +got):\n%s", diff)
}

View File

@@ -318,6 +318,12 @@ func checkHandlerType(apiPattern, browserPattern string) handlerType {
}
func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
// if we are not in a secure context, signal to the CSRF middleware that
// TLS-only header checks should be skipped
if !s.Config.SecureContext {
r = csrf.PlaintextHTTPRequest(r)
}
_, bp := s.BrowserMux.Handler(r)
_, ap := s.APIMux.Handler(r)
switch {

View File

@@ -199,7 +199,8 @@ func (srv *server) OnPolicyChange() {
// - ServerConfigCallback
//
// Do the user auth
// - NoClientAuthHandler
// - NoClientAuthHandler or publicKeyHandler
// - fakePasswordHandler if forcing password auth with the `+password` username suffix
//
// Once auth is done, the conn can be multiplexed with multiple sessions and
// channels concurrently. At which point any of the following can be called
@@ -337,6 +338,21 @@ func (c *conn) fakePasswordHandler(ctx ssh.Context, password string) bool {
return c.anyPasswordIsOkay
}
// publicKeyHandler is our implementation of the PublicKeyHandler hook that
// checks whether the user's public key is correct. It exists for clients that
// don't support "none" auth and instead insist on supplying a public key.
// This ignores the supplied public key and authenticates with Tailscale auth
// in the same way as NoClientAuthCallback.
func (c *conn) publicKeyHandler(ctx ssh.Context, pubKey ssh.PublicKey) error {
if err := c.doPolicyAuth(ctx); err != nil {
return err
}
if err := c.isAuthorized(ctx); err != nil {
return err
}
return nil
}
// doPolicyAuth verifies that conn can proceed.
// It returns nil if the matching policy action is Accept or
// HoldAndDelegate. Otherwise, it returns errDenied.
@@ -413,6 +429,14 @@ func (srv *server) newConn() (*conn, error) {
NoClientAuthHandler: c.NoClientAuthCallback,
PasswordHandler: c.fakePasswordHandler,
// The below handler exists for clients that don't support "none" auth
// and insist on supplying a public key. It ignores the supplied key
// and instead uses the same Tailscale auth as NoClientAuthCallback.
//
// As of 2025-02-10, tailssh_integration_test does not exercise this functionality.
// See tailscale/tailscale#14969.
PublicKeyHandler: c.publicKeyHandler,
Handler: c.handleSessionPostSSHAuth,
LocalPortForwardingCallback: c.mayForwardLocalPortTo,
ReversePortForwardingCallback: c.mayReversePortForwardTo,

View File

@@ -2470,6 +2470,11 @@ const (
// automatically when the network state changes.
NodeAttrDisableCaptivePortalDetection NodeCapability = "disable-captive-portal-detection"
// NodeAttrDisableSkipStatusQueue is set when the node should disable skipping
// of queued netmap.NetworkMap between the controlclient and LocalBackend.
// See tailscale/tailscale#14768.
NodeAttrDisableSkipStatusQueue NodeCapability = "disable-skip-status-queue"
// NodeAttrSSHEnvironmentVariables enables logic for handling environment variables sent
// via SendEnv in the SSH server and applying them to the SSH session.
NodeAttrSSHEnvironmentVariables NodeCapability = "ssh-env-vars"

14
tempfork/acme/README.md Normal file
View File

@@ -0,0 +1,14 @@
# tempfork/acme
This is a vendored copy of Tailscale's https://github.com/tailscale/golang-x-crypto,
which is a fork of golang.org/x/crypto/acme.
See https://github.com/tailscale/tailscale/issues/10238 for unforking
status.
The https://github.com/tailscale/golang-x-crypto location exists to
let us do rebases from upstream easily, and then we update tempfork/acme
in the same commit we go get github.com/tailscale/golang-x-crypto@main.
See the comment on the TestSyncedToUpstream test for details. That
test should catch that forgotten step.

861
tempfork/acme/acme.go Normal file
View File

@@ -0,0 +1,861 @@
// Copyright 2015 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Package acme provides an implementation of the
// Automatic Certificate Management Environment (ACME) spec,
// most famously used by Let's Encrypt.
//
// The initial implementation of this package was based on an early version
// of the spec. The current implementation supports only the modern
// RFC 8555 but some of the old API surface remains for compatibility.
// While code using the old API will still compile, it will return an error.
// Note the deprecation comments to update your code.
//
// See https://tools.ietf.org/html/rfc8555 for the spec.
//
// Most common scenarios will want to use autocert subdirectory instead,
// which provides automatic access to certificates from Let's Encrypt
// and any other ACME-based CA.
package acme
import (
"context"
"crypto"
"crypto/ecdsa"
"crypto/elliptic"
"crypto/rand"
"crypto/sha256"
"crypto/tls"
"crypto/x509"
"crypto/x509/pkix"
"encoding/asn1"
"encoding/base64"
"encoding/hex"
"encoding/json"
"encoding/pem"
"errors"
"fmt"
"math/big"
"net/http"
"strings"
"sync"
"time"
)
const (
// LetsEncryptURL is the Directory endpoint of Let's Encrypt CA.
LetsEncryptURL = "https://acme-v02.api.letsencrypt.org/directory"
// ALPNProto is the ALPN protocol name used by a CA server when validating
// tls-alpn-01 challenges.
//
// Package users must ensure their servers can negotiate the ACME ALPN in
// order for tls-alpn-01 challenge verifications to succeed.
// See the crypto/tls package's Config.NextProtos field.
ALPNProto = "acme-tls/1"
)
// idPeACMEIdentifier is the OID for the ACME extension for the TLS-ALPN challenge.
// https://tools.ietf.org/html/draft-ietf-acme-tls-alpn-05#section-5.1
var idPeACMEIdentifier = asn1.ObjectIdentifier{1, 3, 6, 1, 5, 5, 7, 1, 31}
const (
maxChainLen = 5 // max depth and breadth of a certificate chain
maxCertSize = 1 << 20 // max size of a certificate, in DER bytes
// Used for decoding certs from application/pem-certificate-chain response,
// the default when in RFC mode.
maxCertChainSize = maxCertSize * maxChainLen
// Max number of collected nonces kept in memory.
// Expect usual peak of 1 or 2.
maxNonces = 100
)
// Client is an ACME client.
//
// The only required field is Key. An example of creating a client with a new key
// is as follows:
//
// key, err := rsa.GenerateKey(rand.Reader, 2048)
// if err != nil {
// log.Fatal(err)
// }
// client := &Client{Key: key}
type Client struct {
// Key is the account key used to register with a CA and sign requests.
// Key.Public() must return a *rsa.PublicKey or *ecdsa.PublicKey.
//
// The following algorithms are supported:
// RS256, ES256, ES384 and ES512.
// See RFC 7518 for more details about the algorithms.
Key crypto.Signer
// HTTPClient optionally specifies an HTTP client to use
// instead of http.DefaultClient.
HTTPClient *http.Client
// DirectoryURL points to the CA directory endpoint.
// If empty, LetsEncryptURL is used.
// Mutating this value after a successful call of Client's Discover method
// will have no effect.
DirectoryURL string
// RetryBackoff computes the duration after which the nth retry of a failed request
// should occur. The value of n for the first call on failure is 1.
// The values of r and resp are the request and response of the last failed attempt.
// If the returned value is negative or zero, no more retries are done and an error
// is returned to the caller of the original method.
//
// Requests which result in a 4xx client error are not retried,
// except for 400 Bad Request due to "bad nonce" errors and 429 Too Many Requests.
//
// If RetryBackoff is nil, a truncated exponential backoff algorithm
// with the ceiling of 10 seconds is used, where each subsequent retry n
// is done after either ("Retry-After" + jitter) or (2^n seconds + jitter),
// preferring the former if "Retry-After" header is found in the resp.
// The jitter is a random value up to 1 second.
RetryBackoff func(n int, r *http.Request, resp *http.Response) time.Duration
// UserAgent is prepended to the User-Agent header sent to the ACME server,
// which by default is this package's name and version.
//
// Reusable libraries and tools in particular should set this value to be
// identifiable by the server, in case they are causing issues.
UserAgent string
cacheMu sync.Mutex
dir *Directory // cached result of Client's Discover method
// KID is the key identifier provided by the CA. If not provided it will be
// retrieved from the CA by making a call to the registration endpoint.
KID KeyID
noncesMu sync.Mutex
nonces map[string]struct{} // nonces collected from previous responses
}
// accountKID returns a key ID associated with c.Key, the account identity
// provided by the CA during RFC based registration.
// It assumes c.Discover has already been called.
//
// accountKID requires at most one network roundtrip.
// It caches only successful result.
//
// When in pre-RFC mode or when c.getRegRFC responds with an error, accountKID
// returns noKeyID.
func (c *Client) accountKID(ctx context.Context) KeyID {
c.cacheMu.Lock()
defer c.cacheMu.Unlock()
if c.KID != noKeyID {
return c.KID
}
a, err := c.getRegRFC(ctx)
if err != nil {
return noKeyID
}
c.KID = KeyID(a.URI)
return c.KID
}
var errPreRFC = errors.New("acme: server does not support the RFC 8555 version of ACME")
// Discover performs ACME server discovery using c.DirectoryURL.
//
// It caches successful result. So, subsequent calls will not result in
// a network round-trip. This also means mutating c.DirectoryURL after successful call
// of this method will have no effect.
func (c *Client) Discover(ctx context.Context) (Directory, error) {
c.cacheMu.Lock()
defer c.cacheMu.Unlock()
if c.dir != nil {
return *c.dir, nil
}
res, err := c.get(ctx, c.directoryURL(), wantStatus(http.StatusOK))
if err != nil {
return Directory{}, err
}
defer res.Body.Close()
c.addNonce(res.Header)
var v struct {
Reg string `json:"newAccount"`
Authz string `json:"newAuthz"`
Order string `json:"newOrder"`
Revoke string `json:"revokeCert"`
Nonce string `json:"newNonce"`
KeyChange string `json:"keyChange"`
RenewalInfo string `json:"renewalInfo"`
Meta struct {
Terms string `json:"termsOfService"`
Website string `json:"website"`
CAA []string `json:"caaIdentities"`
ExternalAcct bool `json:"externalAccountRequired"`
}
}
if err := json.NewDecoder(res.Body).Decode(&v); err != nil {
return Directory{}, err
}
if v.Order == "" {
return Directory{}, errPreRFC
}
c.dir = &Directory{
RegURL: v.Reg,
AuthzURL: v.Authz,
OrderURL: v.Order,
RevokeURL: v.Revoke,
NonceURL: v.Nonce,
KeyChangeURL: v.KeyChange,
RenewalInfoURL: v.RenewalInfo,
Terms: v.Meta.Terms,
Website: v.Meta.Website,
CAA: v.Meta.CAA,
ExternalAccountRequired: v.Meta.ExternalAcct,
}
return *c.dir, nil
}
func (c *Client) directoryURL() string {
if c.DirectoryURL != "" {
return c.DirectoryURL
}
return LetsEncryptURL
}
// CreateCert was part of the old version of ACME. It is incompatible with RFC 8555.
//
// Deprecated: this was for the pre-RFC 8555 version of ACME. Callers should use CreateOrderCert.
func (c *Client) CreateCert(ctx context.Context, csr []byte, exp time.Duration, bundle bool) (der [][]byte, certURL string, err error) {
return nil, "", errPreRFC
}
// FetchCert retrieves already issued certificate from the given url, in DER format.
// It retries the request until the certificate is successfully retrieved,
// context is cancelled by the caller or an error response is received.
//
// If the bundle argument is true, the returned value also contains the CA (issuer)
// certificate chain.
//
// FetchCert returns an error if the CA's response or chain was unreasonably large.
// Callers are encouraged to parse the returned value to ensure the certificate is valid
// and has expected features.
func (c *Client) FetchCert(ctx context.Context, url string, bundle bool) ([][]byte, error) {
if _, err := c.Discover(ctx); err != nil {
return nil, err
}
return c.fetchCertRFC(ctx, url, bundle)
}
// RevokeCert revokes a previously issued certificate cert, provided in DER format.
//
// The key argument, used to sign the request, must be authorized
// to revoke the certificate. It's up to the CA to decide which keys are authorized.
// For instance, the key pair of the certificate may be authorized.
// If the key is nil, c.Key is used instead.
func (c *Client) RevokeCert(ctx context.Context, key crypto.Signer, cert []byte, reason CRLReasonCode) error {
if _, err := c.Discover(ctx); err != nil {
return err
}
return c.revokeCertRFC(ctx, key, cert, reason)
}
// FetchRenewalInfo retrieves the RenewalInfo from Directory.RenewalInfoURL.
func (c *Client) FetchRenewalInfo(ctx context.Context, leaf []byte) (*RenewalInfo, error) {
if _, err := c.Discover(ctx); err != nil {
return nil, err
}
parsedLeaf, err := x509.ParseCertificate(leaf)
if err != nil {
return nil, fmt.Errorf("parsing leaf certificate: %w", err)
}
renewalURL, err := c.getRenewalURL(parsedLeaf)
if err != nil {
return nil, fmt.Errorf("generating renewal info URL: %w", err)
}
res, err := c.get(ctx, renewalURL, wantStatus(http.StatusOK))
if err != nil {
return nil, fmt.Errorf("fetching renewal info: %w", err)
}
defer res.Body.Close()
var info RenewalInfo
if err := json.NewDecoder(res.Body).Decode(&info); err != nil {
return nil, fmt.Errorf("parsing renewal info response: %w", err)
}
return &info, nil
}
func (c *Client) getRenewalURL(cert *x509.Certificate) (string, error) {
// See https://www.ietf.org/archive/id/draft-ietf-acme-ari-04.html#name-the-renewalinfo-resource
// for how the request URL is built.
url := c.dir.RenewalInfoURL
if !strings.HasSuffix(url, "/") {
url += "/"
}
aki := base64.RawURLEncoding.EncodeToString(cert.AuthorityKeyId)
serial := base64.RawURLEncoding.EncodeToString(cert.SerialNumber.Bytes())
return fmt.Sprintf("%s%s.%s", url, aki, serial), nil
}
// AcceptTOS always returns true to indicate the acceptance of a CA's Terms of Service
// during account registration. See Register method of Client for more details.
func AcceptTOS(tosURL string) bool { return true }
// Register creates a new account with the CA using c.Key.
// It returns the registered account. The account acct is not modified.
//
// The registration may require the caller to agree to the CA's Terms of Service (TOS).
// If so, and the account has not indicated the acceptance of the terms (see Account for details),
// Register calls prompt with a TOS URL provided by the CA. Prompt should report
// whether the caller agrees to the terms. To always accept the terms, the caller can use AcceptTOS.
//
// When interfacing with an RFC-compliant CA, non-RFC 8555 fields of acct are ignored
// and prompt is called if Directory's Terms field is non-zero.
// Also see Error's Instance field for when a CA requires already registered accounts to agree
// to an updated Terms of Service.
func (c *Client) Register(ctx context.Context, acct *Account, prompt func(tosURL string) bool) (*Account, error) {
if c.Key == nil {
return nil, errors.New("acme: client.Key must be set to Register")
}
if _, err := c.Discover(ctx); err != nil {
return nil, err
}
return c.registerRFC(ctx, acct, prompt)
}
// GetReg retrieves an existing account associated with c.Key.
//
// The url argument is a legacy artifact of the pre-RFC 8555 API
// and is ignored.
func (c *Client) GetReg(ctx context.Context, url string) (*Account, error) {
if _, err := c.Discover(ctx); err != nil {
return nil, err
}
return c.getRegRFC(ctx)
}
// UpdateReg updates an existing registration.
// It returns an updated account copy. The provided account is not modified.
//
// The account's URI is ignored and the account URL associated with
// c.Key is used instead.
func (c *Client) UpdateReg(ctx context.Context, acct *Account) (*Account, error) {
if _, err := c.Discover(ctx); err != nil {
return nil, err
}
return c.updateRegRFC(ctx, acct)
}
// AccountKeyRollover attempts to transition a client's account key to a new key.
// On success client's Key is updated which is not concurrency safe.
// On failure an error will be returned.
// The new key is already registered with the ACME provider if the following is true:
// - error is of type acme.Error
// - StatusCode should be 409 (Conflict)
// - Location header will have the KID of the associated account
//
// More about account key rollover can be found at
// https://tools.ietf.org/html/rfc8555#section-7.3.5.
func (c *Client) AccountKeyRollover(ctx context.Context, newKey crypto.Signer) error {
return c.accountKeyRollover(ctx, newKey)
}
// Authorize performs the initial step in the pre-authorization flow,
// as opposed to order-based flow.
// The caller will then need to choose from and perform a set of returned
// challenges using c.Accept in order to successfully complete authorization.
//
// Once complete, the caller can use AuthorizeOrder which the CA
// should provision with the already satisfied authorization.
// For pre-RFC CAs, the caller can proceed directly to requesting a certificate
// using CreateCert method.
//
// If an authorization has been previously granted, the CA may return
// a valid authorization which has its Status field set to StatusValid.
//
// More about pre-authorization can be found at
// https://tools.ietf.org/html/rfc8555#section-7.4.1.
func (c *Client) Authorize(ctx context.Context, domain string) (*Authorization, error) {
return c.authorize(ctx, "dns", domain)
}
// AuthorizeIP is the same as Authorize but requests IP address authorization.
// Clients which successfully obtain such authorization may request to issue
// a certificate for IP addresses.
//
// See the ACME spec extension for more details about IP address identifiers:
// https://tools.ietf.org/html/draft-ietf-acme-ip.
func (c *Client) AuthorizeIP(ctx context.Context, ipaddr string) (*Authorization, error) {
return c.authorize(ctx, "ip", ipaddr)
}
func (c *Client) authorize(ctx context.Context, typ, val string) (*Authorization, error) {
if _, err := c.Discover(ctx); err != nil {
return nil, err
}
type authzID struct {
Type string `json:"type"`
Value string `json:"value"`
}
req := struct {
Resource string `json:"resource"`
Identifier authzID `json:"identifier"`
}{
Resource: "new-authz",
Identifier: authzID{Type: typ, Value: val},
}
res, err := c.post(ctx, nil, c.dir.AuthzURL, req, wantStatus(http.StatusCreated))
if err != nil {
return nil, err
}
defer res.Body.Close()
var v wireAuthz
if err := json.NewDecoder(res.Body).Decode(&v); err != nil {
return nil, fmt.Errorf("acme: invalid response: %v", err)
}
if v.Status != StatusPending && v.Status != StatusValid {
return nil, fmt.Errorf("acme: unexpected status: %s", v.Status)
}
return v.authorization(res.Header.Get("Location")), nil
}
// GetAuthorization retrieves an authorization identified by the given URL.
//
// If a caller needs to poll an authorization until its status is final,
// see the WaitAuthorization method.
func (c *Client) GetAuthorization(ctx context.Context, url string) (*Authorization, error) {
if _, err := c.Discover(ctx); err != nil {
return nil, err
}
res, err := c.postAsGet(ctx, url, wantStatus(http.StatusOK))
if err != nil {
return nil, err
}
defer res.Body.Close()
var v wireAuthz
if err := json.NewDecoder(res.Body).Decode(&v); err != nil {
return nil, fmt.Errorf("acme: invalid response: %v", err)
}
return v.authorization(url), nil
}
// RevokeAuthorization relinquishes an existing authorization identified
// by the given URL.
// The url argument is an Authorization.URI value.
//
// If successful, the caller will be required to obtain a new authorization
// using the Authorize or AuthorizeOrder methods before being able to request
// a new certificate for the domain associated with the authorization.
//
// It does not revoke existing certificates.
func (c *Client) RevokeAuthorization(ctx context.Context, url string) error {
if _, err := c.Discover(ctx); err != nil {
return err
}
req := struct {
Resource string `json:"resource"`
Status string `json:"status"`
Delete bool `json:"delete"`
}{
Resource: "authz",
Status: "deactivated",
Delete: true,
}
res, err := c.post(ctx, nil, url, req, wantStatus(http.StatusOK))
if err != nil {
return err
}
defer res.Body.Close()
return nil
}
// WaitAuthorization polls an authorization at the given URL
// until it is in one of the final states, StatusValid or StatusInvalid,
// the ACME CA responded with a 4xx error code, or the context is done.
//
// It returns a non-nil Authorization only if its Status is StatusValid.
// In all other cases WaitAuthorization returns an error.
// If the Status is StatusInvalid, the returned error is of type *AuthorizationError.
func (c *Client) WaitAuthorization(ctx context.Context, url string) (*Authorization, error) {
if _, err := c.Discover(ctx); err != nil {
return nil, err
}
for {
res, err := c.postAsGet(ctx, url, wantStatus(http.StatusOK, http.StatusAccepted))
if err != nil {
return nil, err
}
var raw wireAuthz
err = json.NewDecoder(res.Body).Decode(&raw)
res.Body.Close()
switch {
case err != nil:
// Skip and retry.
case raw.Status == StatusValid:
return raw.authorization(url), nil
case raw.Status == StatusInvalid:
return nil, raw.error(url)
}
// Exponential backoff is implemented in c.get above.
// This is just to prevent continuously hitting the CA
// while waiting for a final authorization status.
d := retryAfter(res.Header.Get("Retry-After"))
if d == 0 {
// Given that the fastest challenges TLS-SNI and HTTP-01
// require a CA to make at least 1 network round trip
// and most likely persist a challenge state,
// this default delay seems reasonable.
d = time.Second
}
t := time.NewTimer(d)
select {
case <-ctx.Done():
t.Stop()
return nil, ctx.Err()
case <-t.C:
// Retry.
}
}
}
// GetChallenge retrieves the current status of an challenge.
//
// A client typically polls a challenge status using this method.
func (c *Client) GetChallenge(ctx context.Context, url string) (*Challenge, error) {
if _, err := c.Discover(ctx); err != nil {
return nil, err
}
res, err := c.postAsGet(ctx, url, wantStatus(http.StatusOK, http.StatusAccepted))
if err != nil {
return nil, err
}
defer res.Body.Close()
v := wireChallenge{URI: url}
if err := json.NewDecoder(res.Body).Decode(&v); err != nil {
return nil, fmt.Errorf("acme: invalid response: %v", err)
}
return v.challenge(), nil
}
// Accept informs the server that the client accepts one of its challenges
// previously obtained with c.Authorize.
//
// The server will then perform the validation asynchronously.
func (c *Client) Accept(ctx context.Context, chal *Challenge) (*Challenge, error) {
if _, err := c.Discover(ctx); err != nil {
return nil, err
}
res, err := c.post(ctx, nil, chal.URI, json.RawMessage("{}"), wantStatus(
http.StatusOK, // according to the spec
http.StatusAccepted, // Let's Encrypt: see https://goo.gl/WsJ7VT (acme-divergences.md)
))
if err != nil {
return nil, err
}
defer res.Body.Close()
var v wireChallenge
if err := json.NewDecoder(res.Body).Decode(&v); err != nil {
return nil, fmt.Errorf("acme: invalid response: %v", err)
}
return v.challenge(), nil
}
// DNS01ChallengeRecord returns a DNS record value for a dns-01 challenge response.
// A TXT record containing the returned value must be provisioned under
// "_acme-challenge" name of the domain being validated.
//
// The token argument is a Challenge.Token value.
func (c *Client) DNS01ChallengeRecord(token string) (string, error) {
ka, err := keyAuth(c.Key.Public(), token)
if err != nil {
return "", err
}
b := sha256.Sum256([]byte(ka))
return base64.RawURLEncoding.EncodeToString(b[:]), nil
}
// HTTP01ChallengeResponse returns the response for an http-01 challenge.
// Servers should respond with the value to HTTP requests at the URL path
// provided by HTTP01ChallengePath to validate the challenge and prove control
// over a domain name.
//
// The token argument is a Challenge.Token value.
func (c *Client) HTTP01ChallengeResponse(token string) (string, error) {
return keyAuth(c.Key.Public(), token)
}
// HTTP01ChallengePath returns the URL path at which the response for an http-01 challenge
// should be provided by the servers.
// The response value can be obtained with HTTP01ChallengeResponse.
//
// The token argument is a Challenge.Token value.
func (c *Client) HTTP01ChallengePath(token string) string {
return "/.well-known/acme-challenge/" + token
}
// TLSSNI01ChallengeCert creates a certificate for TLS-SNI-01 challenge response.
//
// Deprecated: This challenge type is unused in both draft-02 and RFC versions of the ACME spec.
func (c *Client) TLSSNI01ChallengeCert(token string, opt ...CertOption) (cert tls.Certificate, name string, err error) {
ka, err := keyAuth(c.Key.Public(), token)
if err != nil {
return tls.Certificate{}, "", err
}
b := sha256.Sum256([]byte(ka))
h := hex.EncodeToString(b[:])
name = fmt.Sprintf("%s.%s.acme.invalid", h[:32], h[32:])
cert, err = tlsChallengeCert([]string{name}, opt)
if err != nil {
return tls.Certificate{}, "", err
}
return cert, name, nil
}
// TLSSNI02ChallengeCert creates a certificate for TLS-SNI-02 challenge response.
//
// Deprecated: This challenge type is unused in both draft-02 and RFC versions of the ACME spec.
func (c *Client) TLSSNI02ChallengeCert(token string, opt ...CertOption) (cert tls.Certificate, name string, err error) {
b := sha256.Sum256([]byte(token))
h := hex.EncodeToString(b[:])
sanA := fmt.Sprintf("%s.%s.token.acme.invalid", h[:32], h[32:])
ka, err := keyAuth(c.Key.Public(), token)
if err != nil {
return tls.Certificate{}, "", err
}
b = sha256.Sum256([]byte(ka))
h = hex.EncodeToString(b[:])
sanB := fmt.Sprintf("%s.%s.ka.acme.invalid", h[:32], h[32:])
cert, err = tlsChallengeCert([]string{sanA, sanB}, opt)
if err != nil {
return tls.Certificate{}, "", err
}
return cert, sanA, nil
}
// TLSALPN01ChallengeCert creates a certificate for TLS-ALPN-01 challenge response.
// Servers can present the certificate to validate the challenge and prove control
// over a domain name. For more details on TLS-ALPN-01 see
// https://tools.ietf.org/html/draft-shoemaker-acme-tls-alpn-00#section-3
//
// The token argument is a Challenge.Token value.
// If a WithKey option is provided, its private part signs the returned cert,
// and the public part is used to specify the signee.
// If no WithKey option is provided, a new ECDSA key is generated using P-256 curve.
//
// The returned certificate is valid for the next 24 hours and must be presented only when
// the server name in the TLS ClientHello matches the domain, and the special acme-tls/1 ALPN protocol
// has been specified.
func (c *Client) TLSALPN01ChallengeCert(token, domain string, opt ...CertOption) (cert tls.Certificate, err error) {
ka, err := keyAuth(c.Key.Public(), token)
if err != nil {
return tls.Certificate{}, err
}
shasum := sha256.Sum256([]byte(ka))
extValue, err := asn1.Marshal(shasum[:])
if err != nil {
return tls.Certificate{}, err
}
acmeExtension := pkix.Extension{
Id: idPeACMEIdentifier,
Critical: true,
Value: extValue,
}
tmpl := defaultTLSChallengeCertTemplate()
var newOpt []CertOption
for _, o := range opt {
switch o := o.(type) {
case *certOptTemplate:
t := *(*x509.Certificate)(o) // shallow copy is ok
tmpl = &t
default:
newOpt = append(newOpt, o)
}
}
tmpl.ExtraExtensions = append(tmpl.ExtraExtensions, acmeExtension)
newOpt = append(newOpt, WithTemplate(tmpl))
return tlsChallengeCert([]string{domain}, newOpt)
}
// popNonce returns a nonce value previously stored with c.addNonce
// or fetches a fresh one from c.dir.NonceURL.
// If NonceURL is empty, it first tries c.directoryURL() and, failing that,
// the provided url.
func (c *Client) popNonce(ctx context.Context, url string) (string, error) {
c.noncesMu.Lock()
defer c.noncesMu.Unlock()
if len(c.nonces) == 0 {
if c.dir != nil && c.dir.NonceURL != "" {
return c.fetchNonce(ctx, c.dir.NonceURL)
}
dirURL := c.directoryURL()
v, err := c.fetchNonce(ctx, dirURL)
if err != nil && url != dirURL {
v, err = c.fetchNonce(ctx, url)
}
return v, err
}
var nonce string
for nonce = range c.nonces {
delete(c.nonces, nonce)
break
}
return nonce, nil
}
// clearNonces clears any stored nonces
func (c *Client) clearNonces() {
c.noncesMu.Lock()
defer c.noncesMu.Unlock()
c.nonces = make(map[string]struct{})
}
// addNonce stores a nonce value found in h (if any) for future use.
func (c *Client) addNonce(h http.Header) {
v := nonceFromHeader(h)
if v == "" {
return
}
c.noncesMu.Lock()
defer c.noncesMu.Unlock()
if len(c.nonces) >= maxNonces {
return
}
if c.nonces == nil {
c.nonces = make(map[string]struct{})
}
c.nonces[v] = struct{}{}
}
func (c *Client) fetchNonce(ctx context.Context, url string) (string, error) {
r, err := http.NewRequest("HEAD", url, nil)
if err != nil {
return "", err
}
resp, err := c.doNoRetry(ctx, r)
if err != nil {
return "", err
}
defer resp.Body.Close()
nonce := nonceFromHeader(resp.Header)
if nonce == "" {
if resp.StatusCode > 299 {
return "", responseError(resp)
}
return "", errors.New("acme: nonce not found")
}
return nonce, nil
}
func nonceFromHeader(h http.Header) string {
return h.Get("Replay-Nonce")
}
// linkHeader returns URI-Reference values of all Link headers
// with relation-type rel.
// See https://tools.ietf.org/html/rfc5988#section-5 for details.
func linkHeader(h http.Header, rel string) []string {
var links []string
for _, v := range h["Link"] {
parts := strings.Split(v, ";")
for _, p := range parts {
p = strings.TrimSpace(p)
if !strings.HasPrefix(p, "rel=") {
continue
}
if v := strings.Trim(p[4:], `"`); v == rel {
links = append(links, strings.Trim(parts[0], "<>"))
}
}
}
return links
}
// keyAuth generates a key authorization string for a given token.
func keyAuth(pub crypto.PublicKey, token string) (string, error) {
th, err := JWKThumbprint(pub)
if err != nil {
return "", err
}
return fmt.Sprintf("%s.%s", token, th), nil
}
// defaultTLSChallengeCertTemplate is a template used to create challenge certs for TLS challenges.
func defaultTLSChallengeCertTemplate() *x509.Certificate {
return &x509.Certificate{
SerialNumber: big.NewInt(1),
NotBefore: time.Now(),
NotAfter: time.Now().Add(24 * time.Hour),
BasicConstraintsValid: true,
KeyUsage: x509.KeyUsageKeyEncipherment | x509.KeyUsageDigitalSignature,
ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
}
}
// tlsChallengeCert creates a temporary certificate for TLS-SNI challenges
// with the given SANs and auto-generated public/private key pair.
// The Subject Common Name is set to the first SAN to aid debugging.
// To create a cert with a custom key pair, specify WithKey option.
func tlsChallengeCert(san []string, opt []CertOption) (tls.Certificate, error) {
var key crypto.Signer
tmpl := defaultTLSChallengeCertTemplate()
for _, o := range opt {
switch o := o.(type) {
case *certOptKey:
if key != nil {
return tls.Certificate{}, errors.New("acme: duplicate key option")
}
key = o.key
case *certOptTemplate:
t := *(*x509.Certificate)(o) // shallow copy is ok
tmpl = &t
default:
// package's fault, if we let this happen:
panic(fmt.Sprintf("unsupported option type %T", o))
}
}
if key == nil {
var err error
if key, err = ecdsa.GenerateKey(elliptic.P256(), rand.Reader); err != nil {
return tls.Certificate{}, err
}
}
tmpl.DNSNames = san
if len(san) > 0 {
tmpl.Subject.CommonName = san[0]
}
der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, key.Public(), key)
if err != nil {
return tls.Certificate{}, err
}
return tls.Certificate{
Certificate: [][]byte{der},
PrivateKey: key,
}, nil
}
// encodePEM returns b encoded as PEM with block of type typ.
func encodePEM(typ string, b []byte) []byte {
pb := &pem.Block{Type: typ, Bytes: b}
return pem.EncodeToMemory(pb)
}
// timeNow is time.Now, except in tests which can mess with it.
var timeNow = time.Now

Some files were not shown because too many files have changed in this diff Show More