Compare commits

...

592 Commits

Author SHA1 Message Date
Jonathan Nobels
b24b36fcc8 ipn, tstime : add opt in rate limiting for netmap updates on the IPN bus
updates tailscale/corp#24553

Adds opt-in rate limiting to limit netmap updates to, at most, one every
3 seconds when the client includes the NotifyRateLimitNetmaps option
in the ipn bus watcher opts.   This should mitigate issues with excessive
memory and CPU usage in clients on large, busy tailnets.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2024-11-15 15:14:16 -05:00
Brad Fitzpatrick
8fd471ce57 control/controlclient: disable https on for http://localhost:$port URLs
Previously we required the program to be running in a test or have
TS_CONTROL_IS_PLAINTEXT_HTTP before we disabled its https fallback
on "http" schema control URLs to localhost with ports.

But nobody accidentally does all three of "http", explicit port
number, localhost and doesn't mean it. And when they mean it, they're
testing a localhost dev control server (like I was) and don't want 443
getting involved.

As of the changes for #13597, this became more annoying in that we
were trying to use a port which wasn't even available.

Updates #13597

Change-Id: Icd00bca56043d2da58ab31de7aa05a3b269c490f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-14 12:12:16 -08:00
Brad Fitzpatrick
e73cfd9700 go.toolchain.rev: bump from Go 1.23.1 to Go 1.23.3
Updates #14100

Change-Id: I57f9d4260be15ce1daebe4a9782910aba3fb9dc9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-14 10:57:49 -08:00
Brad Fitzpatrick
f593d3c5c0 cmd/tailscale/cli: add "help" alias for --help
Fixes #14053

Change-Id: I0a13e11af089f02b0656fea0d316543c67591fb5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-13 11:08:53 -08:00
dependabot[bot]
bfe5cd8760 .github: Bump actions/setup-go from 5.0.2 to 5.1.0 (#13934)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 5.0.2 to 5.1.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](0a12ed9d6a...41dfa10bad)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-13 10:56:44 -07:00
Walter Poupore
0c9ade46a4 words: Add scoville to scales.txt (#14084)
https://en.wikipedia.org/wiki/Scoville_scale

Updates #words

Signed-off-by: Walter Poupore <walterp@tailscale.com>
2024-11-13 09:25:12 -08:00
dependabot[bot]
4474dcea68 .github: Bump actions/cache from 4.1.0 to 4.1.2 (#13933)
Bumps [actions/cache](https://github.com/actions/cache) from 4.1.0 to 4.1.2.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](2cdf405574...6849a64899)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-13 09:46:30 -07:00
dependabot[bot]
0cfa217f3e .github: Bump actions/upload-artifact from 4.4.0 to 4.4.3 (#13811)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.4.0 to 4.4.3.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](50769540e7...b4b15b8c7c)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-13 09:34:10 -07:00
dependabot[bot]
1847f26042 .github: Bump github/codeql-action from 3.26.11 to 3.27.1 (#14062)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.11 to 3.27.1.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](6db8d6351f...4f3212b617)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-13 09:30:14 -07:00
Naman Sood
7c6562c861 words: scale up our word count (#14082)
Updates tailscale/corp#14698

Signed-off-by: Naman Sood <mail@nsood.in>
2024-11-13 09:56:02 -05:00
Brad Fitzpatrick
0c6bd9a33b words: add a scale
https://portsmouthbrewery.com/shilling-scale/

Any scale that includes "wee heavy" is a scale worth including.

Updates #words

Change-Id: I85fd7a64cf22e14f686f1093a220cb59c43e46ba
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-13 06:09:59 -08:00
Irbe Krumina
cf41cec5a8 cmd/{k8s-operator,containerboot},k8s-operator: remove support for proxies below capver 95. (#13986)
Updates tailscale/tailscale#13984

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-12 17:13:26 +00:00
Irbe Krumina
e38522c081 go.{mod,sum},build_docker.sh: bump mkctr, add ability to set OCI annotations for images (#14065)
Updates tailscale/tailscale#12914

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-12 14:23:38 +00:00
Tom Proctor
d8a3683fdf cmd/k8s-operator: restart ProxyGroup pods less (#14045)
We currently annotate pods with a hash of the tailscaled config so that
we can trigger pod restarts whenever it changes. However, the hash
updates more frequently than is necessary causing more restarts than is
necessary. This commit removes two causes; scaling up/down and removing
the auth key after pods have initially authed to control. However, note
that pods will still restart on scale-up/down because of the updated set
of volumes mounted into each pod. Hopefully we can fix that in a planned
follow-up PR.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-11-12 14:18:19 +00:00
Brad Fitzpatrick
4e0fc037e6 all: use iterators over slice views more
This gets close to all of the remaining ones.

Updates #12912

Change-Id: I9c672bbed2654a6c5cab31e0cbece6c107d8c6fa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-11 13:22:34 -08:00
Brad Fitzpatrick
00be1761b7 util/codegen: treat unique.Handle as an opaque value type
It doesn't need a Clone method, like a time.Time, etc.

And then, because Go 1.23+ uses unique.Handle internally for
the netip package types, we can remove those special cases.

Updates #14058 (pulled out from that PR)
Updates tailscale/corp#24485

Change-Id: Iac3548a9417ccda5987f98e0305745a6e178b375
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-11 12:39:19 -08:00
Irbe Krumina
b9ecc50ce3 cmd/k8s-operator,k8s-operator,kube/kubetypes: add an option to configure app connector via Connector spec (#13950)
* cmd/k8s-operator,k8s-operator,kube/kubetypes: add an option to configure app connector via Connector spec

Updates tailscale/tailscale#11113

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-11 11:43:54 +00:00
M. J. Fromberger
6ff85846bc safeweb: add a Shutdown method to the Server type (#14048)
Updates #14047

Change-Id: I2d20454c715b11ad9c6aad1d81445e05a170c3a2
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2024-11-08 10:02:16 -08:00
Anton Tolchanov
64d70fb718 ipn/ipnlocal: log a summary of posture identity response
Perhaps I was too opimistic in #13323 thinking we won't need logs for
this. Let's log a summary of the response without logging specific
identifiers.

Updates tailscale/corp#24437

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-11-08 16:20:07 +00:00
Brad Fitzpatrick
020cacbe70 derp/derphttp: don't link websockets other than on GOOS=js
Or unless the new "ts_debug_websockets" build tag is set.

Updates #1278

Change-Id: Ic4c4f81c1924250efd025b055585faec37a5491d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-07 22:29:41 -08:00
Brad Fitzpatrick
c3306bfd15 control/controlhttp/controlhttpserver: split out Accept to its own package
Otherwise all the clients only using control/controlhttp for the
ts2021 HTTP client were also pulling in WebSocket libraries, as the
server side always needs to speak websockets, but only GOOS=js clients
speak it.

This doesn't yet totally remove the websocket dependency on Linux because
Linux has a envknob opt-in to act like GOOS=js for manual testing and force
the use of WebSockets for DERP only (not control). We can put that behind
a build tag in a future change to eliminate the dep on all GOOSes.

Updates #1278

Change-Id: I4f60508f4cad52bf8c8943c8851ecee506b7ebc9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-07 22:29:41 -08:00
Brad Fitzpatrick
23880eb5b0 cmd/tailscaled: support "ts_omit_ssh" build tag to remove SSH
Some environments would like to remove Tailscale SSH support for the
binary for various reasons when not needed (either for peace of mind,
or the ~1MB of binary space savings).

Updates tailscale/corp#24454
Updates #1278
Updates #12614

Change-Id: Iadd6c5a393992c254b5dc9aa9a526916f96fd07a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-07 16:06:59 -08:00
Irbe Krumina
2c8859c2e7 client/tailscale,ipn/{ipnlocal,localapi}: add a pre-shutdown localAPI endpoint that terminates control connections. (#14028)
Adds a /disconnect-control local API endpoint that just shuts down control client.
This can be run before shutting down an HA subnet router/app connector replica - it will ensure
that all connection to control are dropped and control thus considers this node inactive and tells
peers to switch over to another replica. Meanwhile the existing connections keep working (assuming
that the replica is given some graceful shutdown period).

Updates tailscale/tailscale#14020

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-07 19:27:53 +00:00
Brad Fitzpatrick
3090461961 tsweb/varz: optimize some allocs, add helper func for others
Updates #cleanup
Updates tailscale/corp#23546 (noticed when doing this)

Change-Id: Ia9f627fe32bb4955739b2787210ba18f5de27f4d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-07 08:09:16 -08:00
Irbe Krumina
8ba9b558d2 envknob,kube/kubetypes,cmd/k8s-operator: add app type for ProxyGroup (#14029)
Sets a custom hostinfo app type for ProxyGroup replicas, similarly
to how we do it for all other Kubernetes Operator managed components.

Updates tailscale/tailscale#13406,tailscale/corp#22920

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-07 12:42:29 +00:00
Percy Wegmann
8dcbd988f7 cmd/derper: show more information on home page
- Basic description of DERP

If configured to do so, also show

- Mailto link to security@tailscale.com
- Link to Tailscale Security Policies
- Link to Tailscale Acceptable Use Policy

Updates tailscale/corp#24092

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-11-06 11:06:08 -06:00
License Updater
065825e94c licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2024-11-05 15:33:17 -08:00
Brad Fitzpatrick
01185e436f types/result, util/lineiter: add package for a result type, use it
This adds a new generic result type (motivated by golang/go#70084) to
try it out, and uses it in the new lineutil package (replacing the old
lineread package), changing that package to return iterators:
sometimes over []byte (when the input is all in memory), but sometimes
iterators over results of []byte, if errors might happen at runtime.

Updates #12912
Updates golang/go#70084

Change-Id: Iacdc1070e661b5fb163907b1e8b07ac7d51d3f83
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-05 10:27:52 -08:00
Irbe Krumina
809a6eba80 cmd/k8s-operator: allow to optionally configure tailscaled port (#14005)
Updates tailscale/tailscale#13981

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-04 18:42:51 +00:00
Brad Fitzpatrick
d4222fae95 tsnet: add accessor to get tsd.System
Pulled of otherwise unrelated PR #13884.

Updates tailscale/corp#22075

Change-Id: I5b539fcb4aca1b93406cf139c719a5e3c64ff7f7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-03 09:58:38 -08:00
Brad Fitzpatrick
45da3a4b28 cmd/tsconnect: block after starting esbuild dev server
Thanks to @davidbuzz for raising the issue in #13973.

Fixes #8272
Fixes #13973

Change-Id: Ic413e14d34c82df3c70a97e591b90316b0b4946b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-03 07:30:22 -08:00
VimT
43138c7a5c net/socks5: optimize UDP relay
Key changes:
- No mutex for every udp package: replace syncs.Map with regular map for udpTargetConns
- Use socksAddr as map key for better type safety
- Add test for multi udp target

Updates #7581

Change-Id: Ic3d384a9eab62dcbf267d7d6d268bf242cc8ed3c
Signed-off-by: VimT <me@vimt.me>
2024-11-01 15:47:52 -07:00
VimT
b0626ff84c net/socks5: fix UDP relay in userspace-networking mode
This commit addresses an issue with the SOCKS5 UDP relay functionality
when using the --tun=userspace-networking option. Previously, UDP packets
were not being correctly routed into the Tailscale network in this mode.

Key changes:
- Replace single UDP connection with a map of connections per target
- Use c.srv.dial for creating connections to ensure proper routing

Updates #7581

Change-Id: Iaaa66f9de6a3713218014cf3f498003a7cac9832
Signed-off-by: VimT <me@vimt.me>
2024-11-01 15:47:52 -07:00
Brad Fitzpatrick
634cc2ba4a wgengine/netstack: remove unused taildrive deps
A filesystem was plumbed into netstack in 993acf4475
but hasn't been used since 2d5d6f5403. Remove it.

Noticed while rebasing a Tailscale fork elsewhere.

Updates tailscale/corp#16827

Change-Id: Ib76deeda205ffe912b77a59b9d22853ebff42813
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-01 13:40:46 -07:00
Maisem Ali
d09e9d967f ipn/ipnlocal: reload prefs correctly on ReloadConfig
We were only updating the ProfileManager and not going down
the EditPrefs path which meant the prefs weren't applied
till either the process restarted or some other pref changed.

This makes it so that we reconfigure everything correctly when
ReloadConfig is called.

Updates #13032

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-11-01 13:37:46 -07:00
Renato Aguiar
0ffc7bf38b Fix MagicDNS on OpenBSD
Add OpenBSD to the list of platforms that need DNS reconfigured on link changes.

Signed-off-by: Renato Aguiar <renato@renatoaguiar.net>
2024-11-01 10:44:30 -07:00
Jordan Whited
49de23cf1b net/netcheck: add addReportHistoryAndSetPreferredDERP() test case (#13989)
Add an explicit case for exercising preferred DERP hysteresis around
the branch that compares latencies on a percentage basis.

Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-10-31 19:25:00 -07:00
Aaron Klotz
84c8860472 util/syspolicy: add policy key for onboarding flow visibility
Updates https://github.com/tailscale/corp/issues/23789

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2024-10-31 15:46:40 -06:00
Andrew Lytvynov
ddbc950f46 safeweb: add support for custom CSP (#13975)
To allow more flexibility with CSPs, add a fully customizable `CSP` type
that can be provided in `Config` and encodes itself into the correct
format. Preserve the `CSPAllowInlineStyles` option as is today, but
maybe that'll get deprecated later in favor of the new CSP field.

In particular, this allows for pages loading external JS, or inline JS
with nonces or hashes (see
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/script-src#unsafe_inline_script)

Updates https://github.com/tailscale/corp/issues/8027

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-10-31 12:13:29 -07:00
Andrea Gottardo
6985369479 net/sockstats: prevent crash in setNetMon (#13985) 2024-10-31 12:00:34 -07:00
Andrew Lytvynov
3477bfd234 safeweb: add support for "/" and "/foo" handler distinction (#13980)
By counting "/" elements in the pattern we catch many scenarios, but not
the root-level handler. If either of the patterns is "/", compare the
pattern length to pick the right one.

Updates https://github.com/tailscale/corp/issues/8027

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-10-31 11:12:38 -07:00
Nick Khyl
3f626c0d77 cmd/tailscale/cli, client/tailscale, ipn/localapi: add tailscale syspolicy {list,reload} commands
In this PR, we add the tailscale syspolicy command with two subcommands: list, which displays
policy settings, and reload, which forces a reload of those settings. We also update the LocalAPI
and LocalClient to facilitate these additions.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-31 10:53:43 -05:00
Irbe Krumina
45354dab9b ipn,tailcfg: add app connector config knob to conffile (#13942)
Make it possible to advertise app connector via a new conffile field.
Also bumps capver - conffile deserialization errors out if unknonw
fields are set, so we need to know which clients understand the new field.

Updates tailscale/tailscale#11113

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-31 14:45:57 +00:00
Anton Tolchanov
b4f46c31bb wgengine/magicsock: export packet drop metric for outbound errors
This required sharing the dropped packet metric between two packages
(tstun and magicsock), so I've moved its definition to util/usermetric.

Updates tailscale/corp#22075

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-10-31 08:33:24 +00:00
Anton Tolchanov
532b26145a wgengine/magicsock: exclude disco from throughput metrics
The user-facing metrics are intended to track data transmitted at
the overlay network level.

Updates tailscale/corp#22075

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-10-31 08:01:19 +00:00
James Tucker
e1e22785b4 net/netcheck: ensure prior preferred DERP is always in netchecks
In an environment with unstable latency, such as upstream bufferbloat,
there are cases where a full netcheck could drop the prior preferred
DERP (likely home DERP) from future netcheck probe plans. This will then
likely result in a home DERP having a missing sample on the next
incremental netcheck, ultimately resulting in a home DERP move.

This change does not fix our overall response to highly unstable
latency, but it is an incremental improvement to prevent single spurious
samples during a full netcheck from alone triggering a flapping
condition, as now the prior changes to include historical latency will
still provide the desired resistance, and the home DERP should not move
unless latency is consistently worse over a 5 minute period.

Note that there is a nomenclature and semantics issue remaining in the
difference between a report preferred DERP and a home DERP. A report
preferred DERP is aspirational, it is what will be picked as a home DERP
if a home DERP connection needs to be established. A nodes home DERP may
be different than a recent preferred DERP, in which case a lot of
netcheck logic is fallible. In future enhancements much of the DERP move
logic should move to consider the home DERP, rather than recent report
preferred DERP.

Updates #8603
Updates #13969

Signed-off-by: James Tucker <james@tailscale.com>
2024-10-30 17:19:26 -07:00
Brad Fitzpatrick
f81348a16b util/syspolicy/source: put EnvPolicyStore env keys in their own namespace
... all prefixed with TS_DEBUGSYSPOLICY_*.

Updates #13193
Updates #12687
Updates #13855

Change-Id: Ia8024946f53e2b3afda4456a7bb85bbcf6d12bfc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-30 11:27:27 -07:00
Nick Khyl
540e4c83d0 util/syspolicy/setting: make setting.Snapshot JSON-marshallable
We make setting.Snapshot JSON-marshallable in preparation for returning it from the LocalAPI.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-30 12:50:29 -05:00
Nick Khyl
2a2228f97b util/syspolicy/setting: make setting.RawItem JSON-marshallable
We add setting.RawValue, a new type that facilitates unmarshalling JSON numbers and arrays
as uint64 and []string (instead of float64 and []any) for policy setting values.
We then use it to make setting.RawItem JSON-marshallable and update the tests.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-30 12:50:29 -05:00
Nick Khyl
2cc1100d24 util/syspolicy/source: use errors instead of github.com/pkg/errors
Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-30 12:14:36 -05:00
Nick Khyl
2336c340c4 util/syspolicy: implement a syspolicy store that reads settings from environment variables
In this PR, we implement (but do not use yet, pending #13727 review) a syspolicy/source.Store
that reads policy settings from environment variables. It converts a CamelCase setting.Key,
such as AuthKey or ExitNodeID, to a SCREAMING_SNAKE_CASE, TS_-prefixed environment
variable name, such as TS_AUTH_KEY and TS_EXIT_NODE_ID. It then looks up the variable
and attempts to parse it according to the expected value type. If the environment variable
is not set, the policy setting is considered not configured in this store (the syspolicy package
will still read it from other sources). Similarly, if the environment variable has an invalid value
for the setting type, it won't be used (though the reported/logged error will differ).

Updates #13193
Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-30 11:12:22 -05:00
Irbe Krumina
1103044598 cmd/k8s-operator,k8s-operator: add topology spread constraints to ProxyClass (#13959)
Now when we have HA for egress proxies, it makes sense to support topology
spread constraints that would allow users to define more complex
topologies of how proxy Pods need to be deployed in relation with other
Pods/across regions etc.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-30 10:45:31 +00:00
Tim Walters
856ea2376b wgengine/magicsock: log home DERP changes with latency
This adds additional logging on DERP home changes to allow
better troubleshooting.

Updates tailscale/corp#18095

Signed-off-by: Tim Walters <tim@tailscale.com>
2024-10-29 16:05:41 -04:00
Jonathan Nobels
aecb0ab76b tstest/tailmac: add support for mounting host directories in the guest (#13957)
updates tailscale/corp#24197

tailmac run now supports the --share option which will allow you
to specify a directory on the host which can be mounted in the guest
using  mount_virtiofs vmshare <path>.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2024-10-29 13:49:51 -04:00
Jonathan Nobels
0f9a054cba tstest/tailmac: fix Host.app path generation (#13953)
updates tailscale/corp#24197

Generation of the Host.app path was erroneous and tailmac run
would not work unless the pwd was tailmac/bin.  Now you can
be able to invoke tailmac from anywhere.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2024-10-29 13:49:29 -04:00
Anton Tolchanov
9545e36007 cmd/tailscale/cli: add 'tailscale metrics' command
- `tailscale metrics print`: to show metric values in console
- `tailscale metrics write`: to write metrics to a file (with a tempfile
  & rename dance, which is atomic on Unix).

Also, remove the `TS_DEBUG_USER_METRICS` envknob as we are getting
more confident in these metrics.

Updates tailscale/corp#22075

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-10-29 15:08:36 +00:00
Anton Tolchanov
38af62c7b3 ipn/ipnlocal: remove the primary routes gauge for now
Not confident this is the right way to expose this, so let's remote it
for now.

Updates tailscale/corp#22075

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-10-29 15:07:54 +00:00
Anton Tolchanov
11e96760ff wgengine/magicsock: fix stats packet counter on derp egress
Updates tailscale/corp#22075

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-10-29 15:07:45 +00:00
Anton Tolchanov
94fa6d97c5 ipn/ipnlocal: log errors while fetching serial numbers
If the client cannot fetch a serial number, write a log message helping
the user understand what happened. Also, don't just return the error
immediately, since we still have a chance to collect network interface
addresses.

Updates #5902

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-10-29 14:36:08 +00:00
James Tucker
0d76d7d21c tool/gocross: remove trimpath from test builds
trimpath can be inconvenient for IDEs and LSPs that do not always
correctly handle module relative paths, and can also contribute to
caching bugs taking effect. We rarely have a real need for trimpath of
test produced binaries, so avoiding it should be a net win.

Updates #2988
Signed-off-by: James Tucker <james@tailscale.com>
2024-10-28 16:10:55 -07:00
James Tucker
c0a1ed86cb tstest/natlab: add latency & loss simulation
A simple implementation of latency and loss simulation, applied to
writes to the ethernet interface of the NIC. The latency implementation
could be optimized substantially later if necessary.

Updates #13355
Signed-off-by: James Tucker <james@tailscale.com>
2024-10-28 12:49:56 -07:00
License Updater
41aac26106 licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2024-10-28 08:38:18 -07:00
Renato Aguiar
5d07c17b93 net/dns: fix blank lines being added to resolv.conf on OpenBSD (#13928)
During resolv.conf update, old 'search' lines are cleared but '\n' is not
deleted, leaving behind a new blank line on every update.

This adds 's' flag to regexp, so '\n' is included in the match and deleted when
old lines are cleared.

Also, insert missing `\n` when updated 'search' line is appended to resolv.conf.

Signed-off-by: Renato Aguiar <renato@renatoaguiar.net>
2024-10-28 08:00:48 -07:00
Irbe Krumina
9d1348fe21 ipn/store/kubestore: don't error if state cannot be preloaded (#13926)
Preloading of state from kube Secret should not
error if the Secret does not exist.

Updates tailscale/tailscale#7671

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-27 15:54:38 +00:00
Irbe Krumina
853fe3b713 ipn/store/kubestore: cache state in memory (#13918)
Cache state in memory on writes, read from memory
in reads.
kubestore was previously always reading state from a Secret.
This change should fix bugs caused by temporary loss of access
to kube API server and imporove overall performance

Fixes #7671
Updates tailscale/tailscale#12079,tailscale/tailscale#13900

Signed-off-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
2024-10-26 09:33:47 -05:00
Nick Kirby
6ab39b7bcd cmd/k8s-operator: validate that tailscale.com/tailnet-ip annotation value is a valid IP
Fixes #13836
Signed-off-by: Nick Kirby <nrkirb@gmail.com>
2024-10-26 13:03:36 +01:00
Nick Khyl
e815ae0ec4 util/syspolicy, ipn/ipnlocal: update syspolicy package to utilize syspolicy/rsop
In this PR, we update the syspolicy package to utilize syspolicy/rsop under the hood,
and remove syspolicy.CachingHandler, syspolicy.windowsHandler and related code
which is no longer used.

We mark the syspolicy.Handler interface and RegisterHandler/SetHandlerForTest functions
as deprecated, but keep them temporarily until they are no longer used in other repos.

We also update the package to register setting definitions for all existing policy settings
and to register the Registry-based, Windows-specific policy stores when running on Windows.

Finally, we update existing internal and external tests to use the new API and add a few more
tests and benchmarks.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-25 12:41:07 -05:00
Andrew Dunham
7fe6e50858 net/dns/resolver: fix test flake
Updates #13902

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib2def19caad17367e9a31786ac969278e65f51c6
2024-10-24 13:36:57 -05:00
Paul Scott
212270463b cmd/testwrapper: add pkg runtime to output (#13894)
Fixes #13893

Signed-off-by: Paul Scott <paul@tailscale.com>
2024-10-24 09:41:54 -05:00
Andrew Dunham
b2665d9b89 net/netcheck: add a Now field to the netcheck Report
This allows us to print the time that a netcheck was run, which is
useful in debugging.

Updates #10972

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id48d30d4eb6d5208efb2b1526a71d83fe7f9320b
2024-10-22 15:52:42 -04:00
Brad Fitzpatrick
ae5bc88ebe health: fix spurious warning about DERP home region '0'
Updates #13650

Change-Id: I6b0f165f66da3f881a4caa25d2d9936dc2a7f22c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-22 10:01:30 -05:00
Maisem Ali
85241f8408 net/tstun: use /10 as subnet for TAP mode; read IP from netmap
Few changes to resolve TODOs in the code:
- Instead of using a hardcoded IP, get it from the netmap.
- Use 100.100.100.100 as the gateway IP
- Use the /10 CGNAT range instead of a random /24

Updates #2589

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-10-21 17:24:29 -07:00
Maisem Ali
d4d21a0bbf net/tstun: restore tap mode functionality
It had bit-rotted likely during the transition to vector io in
76389d8baf. Tested on Ubuntu 24.04
by creating a netns and doing the DHCP dance to get an IP.

Updates #2589

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-10-21 17:02:53 -07:00
Nick Khyl
0f4c9c0ecb cmd/viewer: import types/views when generating a getter for a map field
Fixes #13873

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-21 16:29:16 -05:00
Andrea Gottardo
f8f53bb6d4 health: remove SysDNSOS, add two Warnables for read+set system DNS config (#13874) 2024-10-21 13:40:43 -07:00
Erisa A
72587ab03c scripts/installer.sh: allow Archcraft for Arch packages (#13870)
Fixes #13869

Signed-off-by: Erisa A <erisa@tailscale.com>
2024-10-21 18:13:06 +01:00
Brad Fitzpatrick
c76a6e5167 derp: track client-advertised non-ideal DERP connections in more places
In f77821fd63 (released in v1.72.0), we made the client tell a DERP server
when the connection was not its ideal choice (the first node in its region).

But we didn't do anything with that information until now. This adds a
metric about how many such connections are on a given derper, and also
adds a bit to the PeerPresentFlags bitmask so watchers can identify
(and rebalance) them.

Updates tailscale/corp#372

Change-Id: Ief8af448750aa6d598e5939a57c062f4e55962be
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-20 19:56:28 -07:00
Andrea Gottardo
fd77965f23 net/tlsdial: call out firewalls blocking Tailscale in health warnings (#13840)
Updates tailscale/tailscale#13839

Adds a new blockblame package which can detect common MITM SSL certificates used by network appliances. We use this in `tlsdial` to display a dedicated health warning when we cannot connect to control, and a network appliance MITM attack is detected.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-10-19 00:35:46 +00:00
Mario Minardi
e711ee5d22 release/dist: clamp min / max version for synology package centre (#13857)
Clamp the min and max version for DSM 7.0 and DSM 7.2 packages when we
are building packages for the synology package centre. This change
leaves packages destined for pkgs.tailscale.com with just the min
version set to not break packages in the wild / our update flow.

Updates https://github.com/tailscale/corp/issues/22908

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-10-18 14:20:40 -06:00
Jordan Whited
877fa504b4 net/netcheck: remove arbitrary deadlines from GetReport() tests (#13832)
GetReport() may have side effects when the caller enforces a deadline
that is shorter than ReportTimeout.

Updates #13783
Updates #13394

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-10-18 13:12:07 -07:00
Nick Khyl
874db2173b ipn/{ipnauth,ipnlocal,ipnserver}: send the auth URL to the user who started interactive login
We add the ClientID() method to the ipnauth.Actor interface and updated ipnserver.actor to implement it.
This method returns a unique ID of the connected client if the actor represents one. It helps link a series
of interactions initiated by the client, such as when a notification needs to be sent back to a specific session,
rather than all active sessions, in response to a certain request.

We also add LocalBackend.WatchNotificationsAs and LocalBackend.StartLoginInteractiveAs methods,
which are like WatchNotifications and StartLoginInteractive but accept an additional parameter
specifying an ipnauth.Actor who initiates the operation. We store these actor identities in
watchSession.owner and LocalBackend.authActor, respectively,and implement LocalBackend.sendTo
and related helper methods to enable sending notifications to watchSessions associated with actors
(or, more broadly, identifiable recipients).

We then use the above to change who receives the BrowseToURL notifications:
 - For user-initiated, interactive logins, the notification is delivered only to the user who initiated the
   process. If the initiating actor represents a specific connected client, the URL notification is sent back
   to the same LocalAPI client that called StartLoginInteractive. Otherwise, the notification is sent to all
   clients connected as that user.
   Currently, we only differentiate between users on Windows, as it is inherently a multi-user OS.
 - In all other cases (e.g., node key expiration), we send the notification to all connected users.

Updates tailscale/corp#18342

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-18 15:10:02 -05:00
Jordan Whited
bb60da2764 derp: add sclient write deadline timeout metric (#13831)
Write timeouts can be indicative of stalled TCP streams. Understanding
changes in the rate of such events can be helpful in an ops context.

Updates tailscale/corp#23668

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-10-18 10:53:49 -07:00
Brad Fitzpatrick
18fc093c0d derp: give trusted mesh peers longer write timeouts
Updates tailscale/corp#24014

Change-Id: I700872be48ab337dce8e11cabef7f82b97f0422a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-18 09:37:20 -07:00
Andrew Dunham
c0a9895748 scripts/installer.sh: support DNF5
This fixes the installation on newer Fedora versions that use dnf5 as
the 'dnf' binary.

Updates #13828

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I39513243c81640fab244a32b7dbb3f32071e9fce
2024-10-17 20:28:41 -04:00
Andrea Gottardo
fa95318a47 tool/gocross: add support for tvOS Simulator (#13847)
Updates ENG-5321

Allow gocross to build a static library for the Apple TV Simulator.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-10-17 15:37:10 -07:00
Naman Sood
22c89fcb19 cmd/tailscale,ipn,tailcfg: add tailscale advertise subcommand behind envknob (#13734)
Signed-off-by: Naman Sood <mail@nsood.in>
2024-10-16 19:08:06 -04:00
Mario Minardi
d32d742af0 ipn/ipnlocal: error when trying to use exit node on unsupported platform (#13726)
Adds logic to `checkExitNodePrefsLocked` to return an error when
attempting to use exit nodes on a platform where this is not supported.
This mirrors logic that was added to error out when trying to use `ssh`
on an unsupported platform, and has very similar semantics.

Fixes https://github.com/tailscale/tailscale/issues/13724

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-10-16 14:09:53 -06:00
Brad Fitzpatrick
6a885dbc36 wgengine/magicsock: fix CI-only test warning of missing health tracker
While looking at deflaking TestTwoDevicePing/ping_1.0.0.2_via_SendPacket,
there were a bunch of distracting:

    WARNING: (non-fatal) nil health.Tracker (being strict in CI): ...

This pacifies those so it's easier to work on actually deflaking the test.

Updates #11762
Updates #11874

Change-Id: I08dcb44511d4996b68d5f1ce5a2619b555a2a773
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-16 09:40:49 -07:00
Christian
74dd24ce71 cmd/tsconnect, logpolicy: fixes for wasm_js.go
* updates to LocalBackend require metrics to be passed in which are now initialized
* os.MkdirTemp isn't supported in wasm/js so we simply return empty
  string for logger
* adds a UDP dialer which was missing and led to the dialer being
  incompletely initialized

Fixes #10454 and #8272

Signed-off-by: Christian <christian@devzero.io>
2024-10-16 09:39:48 -07:00
Nick Khyl
ff5f233c3a util/syspolicy: add rsop package that provides access to the resultant policy
In this PR we add syspolicy/rsop package that facilitates policy source registration
and provides access to the resultant policy merged from all registered sources for a
given scope.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-16 00:06:14 -05:00
Andrew Dunham
2aa9125ac4 cmd/derpprobe: add /healthz endpoint
For a customer that wants to run their own DERP prober, let's add a
/healthz endpoint that can be used to monitor derpprobe itself.

Updates #6526

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Iba315c999fc0b1a93d8c503c07cc733b4c8d5b6b
2024-10-15 16:35:24 -04:00
Tom Proctor
5f22f72636 hostinfo,build_docker.sh,tailcfg: more reliably detect being in a container (#13826)
Our existing container-detection tricks did not work on Kubernetes,
where Docker is no longer used as a container runtime. Extends the
existing go build tags for containers to the other container packages
and uses that to reliably detect builds that were created by Tailscale
for use in a container. Unfortunately this doesn't necessarily improve
detection for users' custom builds, but that's a separate issue.

Updates #13825

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-15 19:38:11 +01:00
License Updater
a8f9c0d6e4 licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2024-10-14 08:10:13 -07:00
Kristoffer Dalby
e0d711c478 {net/connstats,wgengine/magicsock}: fix packet counting in connstats
connstats currently increments the packet counter whenever it is called
to store a length of data, however when udp batch sending was introduced
we pass the length for a series of packages, and it is only incremented
ones, making it count wrongly if we are on a platform supporting udp
batches.

Updates tailscale/corp#22075

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-10-14 14:17:56 +02:00
Kristoffer Dalby
40c991f6b8 wgengine: instrument with usermetrics
Updates tailscale/corp#22075

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-10-14 11:34:31 +02:00
Paul Scott
adc8368964 tstest: avoid Fatal in ResourceCheck to show panic (#13790)
Fixes #13789

Signed-off-by: Paul Scott <paul@tailscale.com>
2024-10-14 10:02:04 +01:00
Percy Wegmann
12e6094d9c ssh/tailssh: calculate passthrough environment at latest possible stage
This allows passing through any environment variables that we set ourselves, for example DBUS_SESSION_BUS_ADDRESS.

Updates #11175

Co-authored-by: Mario Minardi <mario@tailscale.com>
Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-10-11 15:25:30 -05:00
Joe Tsai
ecc8035f73 types/bools: add Compare to compare boolean values (#13792)
The bools.Compare function compares boolean values
by reporting -1, 0, +1 for ordering so that it can be easily
used with slices.SortFunc.

Updates #cleanup
Updates tailscale/corp#11038

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-10-11 13:12:18 -07:00
Nick Khyl
f07ff47922 net/dns/resolver: add tests for using a forwarder with multiple upstream resolvers
If multiple upstream DNS servers are available, quad-100 sends requests to all of them
and forwards the first successful response, if any. If no successful responses are received,
it propagates the first failure from any of them.

This PR adds some test coverage for these scenarios.

Updates #13571

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-11 12:02:27 -05:00
Nick Hill
c2144c44a3 net/dns/resolver: update (*forwarder).forwardWithDestChan to always return an error unless it sends a response to responseChan
We currently have two executions paths where (*forwarder).forwardWithDestChan
returns nil, rather than an error, without sending a DNS response to responseChan.

These paths are accompanied by a comment that reads:
// Returning an error will cause an internal retry, there is
// nothing we can do if parsing failed. Just drop the packet.
But it is not (or no longer longer) accurate: returning an error from forwardWithDestChan
does not currently cause a retry.

Moreover, although these paths are currently unreachable due to implementation details,
if (*forwarder).forwardWithDestChan were to return nil without sending a response to
responseChan, it would cause a deadlock at one call site and a panic at another.

Therefore, we update (*forwarder).forwardWithDestChan to return errors in those two paths
and remove comments that were no longer accurate and misleading.

Updates #cleanup
Updates #13571

Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
2024-10-11 12:02:27 -05:00
Nick Hill
e7545f2eac net/dns/resolver: translate 5xx DoH server errors into SERVFAIL DNS responses
If a DoH server returns an HTTP server error, rather than a SERVFAIL within
a successful HTTP response, we should handle it in the same way as SERVFAIL.

Updates #13571

Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
2024-10-11 12:02:27 -05:00
Nick Hill
17335d2104 net/dns/resolver: forward SERVFAIL responses over PeerDNS
As per the docstring, (*forwarder).forwardWithDestChan should either send to responseChan
and returns nil, or returns a non-nil error (without sending to the channel).
However, this does not hold when all upstream DNS servers replied with an error.

We've been handling this special error path in (*Resolver).Query but not in (*Resolver).HandlePeerDNSQuery.
As a result, SERVFAIL responses from upstream servers were being converted into HTTP 503 responses,
instead of being properly forwarded as SERVFAIL within a successful HTTP response, as per RFC 8484, section 4.2.1:
A successful HTTP response with a 2xx status code (see Section 6.3 of [RFC7231]) is used for any valid DNS response,
regardless of the DNS response code. For example, a successful 2xx HTTP status code is used even with a DNS message
whose DNS response code indicates failure, such as SERVFAIL or NXDOMAIN.

In this PR we fix (*forwarder).forwardWithDestChan to no longer return an error when it sends a response to responseChan,
and remove the special handling in (*Resolver).Query, as it is no longer necessary.

Updates #13571

Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
2024-10-11 12:02:27 -05:00
Percy Wegmann
f9949cde8b client/tailscale,cmd/{cli,get-authkey,k8s-operator}: set distinct User-Agents
This helps better distinguish what is generating activity to the
Tailscale public API.

Updates tailscale/corp#23838

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-10-11 10:45:03 -05:00
Jordan Whited
33029d4486 net/netcheck: fix netcheck cli-triggered nil pointer deref (#13782)
Updates #13780

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-10-10 15:52:47 -07:00
Jonathan Nobels
acb4a22dcc VERSION.txt: this is v1.77.0 (#13779) 2024-10-10 11:34:14 -07:00
Brad Fitzpatrick
508980603b ipn/conffile: don't depend on hujson on iOS/Android
Fixes #13772

Change-Id: I3ae03a5ee48c801f2e5ea12d1e54681df25d4604
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-10 09:14:36 -07:00
Andrew Dunham
91f58c5e63 tsnet: fix panic caused by logging after test finishes
Updates #13773

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I95e03eb6aef1639bd4a2efd3a415e2c10cdebc5a
2024-10-10 11:11:02 -04:00
Brad Fitzpatrick
1938685d39 clientupdate: don't link distsign on platforms that don't download
Updates tailscale/corp#20099

Change-Id: Ie3b782379b19d5f7890a8d3a378096b4f3e8a612
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-10 06:32:50 -07:00
Irbe Krumina
db1519cc9f k8s-operator/apis: revert ProxyGroup readiness cond name change (#13770)
No need to prefix this with 'Tailscale' for tailscale.com
custom resource types.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-10 13:00:32 +01:00
Brad Fitzpatrick
2531065d10 clientupdate, ipn/localapi: don't use google/uuid, thin iOS deps
We were using google/uuid in two places and that brought in database/sql/driver.

We didn't need it in either place.

Updates #13760
Updates tailscale/corp#20099

Change-Id: Ieed32f1bebe35d35f47ec5a2a429268f24f11f1f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-09 20:27:35 -07:00
Brad Fitzpatrick
fb420be176 safesocket: don't depend on go-ps on iOS
There's never a tailscaled on iOS. And we can't run child processes to
look for it anyway.

Updates tailscale/corp#20099

Change-Id: Ieb3776f4bb440c4f1c442fdd169bacbe17f23ddb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-09 18:35:53 -07:00
Brad Fitzpatrick
367fba8520 control/controlhttp: don't link ts2021 server + websocket code on iOS
We probably shouldn't link it in anywhere, but let's fix iOS for now.

Updates #13762
Updates tailscale/corp#20099

Change-Id: Idac116e9340434334c256acba3866f02bd19827c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-09 18:25:02 -07:00
Joe Tsai
52ef27ab7c taildrop: fix defer in loop (#13757)
However, this affects the scope of a defer.

Updates #11038

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-10-09 14:09:58 -07:00
Joe Tsai
5b7303817e syncs: allocate map with Map.WithLock (#13755)
One primary purpose of WithLock is to mutate the underlying map.
However, this can lead to a panic if it happens to be nil.
Thus, always allocate a map before passing it to f.

Updates tailscale/corp#11038

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-10-09 14:03:37 -07:00
Brad Fitzpatrick
c763b7a7db syncs: delete Map.Range, update callers to iterators
Updates #11038

Change-Id: I2819fed896cc4035aba5e4e141b52c12637373b1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-09 13:56:13 -07:00
Percy Wegmann
2cadb80fb2 util/vizerror: add WrapWithMessage
Thus new function allows constructing vizerrors that combine a message
appropriate for display to users with a wrapped underlying error.

Updates tailscale/corp#23781

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-10-09 12:59:25 -05:00
Joe Tsai
910b4e8e6a syncs: add iterators to Map (#13739)
Add Keys, Values, and All to iterate over
all keys, values, and entries, respectively.

Updates #11038

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-10-09 10:28:12 -07:00
Irbe Krumina
89ee6bbdae cmd/k8s-operator,k8s-operator/apis: set a readiness condition on egress Services for ProxyGroup (#13746)
cmd/k8s-operator,k8s-operator/apis: set a readiness condition on egress Services

Set a readiness condition on ExternalName Services that define a tailnet target
to route cluster traffic to via a ProxyGroup's proxies. The condition
is set to true if at least one proxy is currently set up to route.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-09 18:23:40 +01:00
Brad Fitzpatrick
94c79659fa types/views: add iterators to the three Map view types
Their callers using Range are all kinda clunky feeling. Iterators
should make them more readable.

Updates #12912

Change-Id: I93461eba8e735276fda4a8558a4ae4bfd6c04922
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-09 10:00:29 -07:00
Irbe Krumina
f6d4d03355 cmd/k8s-operator: don't error out if ProxyClass for ProxyGroup not found. (#13736)
We don't need to error out and continuously reconcile if ProxyClass
has not (yet) been created, once it gets created the ProxyGroup
reconciler will get triggered.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-09 13:23:00 +01:00
Irbe Krumina
60011e73b8 cmd/k8s-operator: fix Pod IP selection (#13743)
Ensure that .status.podIPs is used to select Pod's IP
in all reconcilers.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-09 13:22:50 +01:00
Nick Khyl
da40609abd util/syspolicy, ipn: add "tailscale debug component-logs" support
Fixes #13313
Fixes #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-08 18:11:23 -05:00
Nick Khyl
29cf59a9b4 util/syspolicy/setting: update Snapshot to use Go 1.23 iterators
Updates #12912
Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-08 15:02:23 -05:00
Tom Proctor
07c157ee9f cmd/k8s-operator: base ProxyGroup StatefulSet on common proxy.yaml definition (#13714)
As discussed in #13684, base the ProxyGroup's proxy definitions on the same
scaffolding as the existing proxies, as defined in proxy.yaml

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-08 20:05:08 +01:00
Tom Proctor
83efadee9f kube/egressservices: improve egress ports config readability (#13722)
Instead of converting our PortMap struct to a string during marshalling
for use as a key, convert the whole collection of PortMaps to a list of
PortMap objects, which improves the readability of the JSON config while
still keeping the data structure we need in the code.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-08 19:48:18 +01:00
Brad Fitzpatrick
841eaacb07 net/sockstats: quiet some log spam in release builds
Updates #13731

Change-Id: Ibee85426827ebb9e43a1c42a9c07c847daa50117
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-08 11:02:46 -07:00
Irbe Krumina
861dc3631c cmd/{k8s-operator,containerboot},kube/egressservices: fix Pod IP check for dual stack clusters (#13721)
Currently egress Services for ProxyGroup only work for Pods and Services
with IPv4 addresses. Ensure that it works on dual stack clusters by reading
proxy Pod's IP from the .status.podIPs list that always contains both
IPv4 and IPv6 address (if the Pod has them) rather than .status.podIP that
could contain IPv6 only for a dual stack cluster.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-08 18:35:23 +01:00
Andrew Dunham
8ee7f82bf4 net/netcheck: don't panic if a region has no Nodes
Updates #13728

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1e8319d6b2da013ae48f15113b30c9333e69cc0b
2024-10-08 12:52:27 -04:00
Tom Proctor
36cb2e4e5f cmd/k8s-operator,k8s-operator: use default ProxyClass if set for ProxyGroup (#13720)
The default ProxyClass can be set via helm chart or env var, and applies
to all proxies that do not otherwise have an explicit ProxyClass set.
This ensures proxies created by the new ProxyGroup CRD are consistent
with the behaviour of existing proxies

Nearby but unrelated changes:

* Fix up double error logs (controller runtime logs returned errors)
* Fix a couple of variable names

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-08 17:34:34 +01:00
Tom Proctor
cba2e76568 cmd/containerboot: simplify k8s setup logic (#13627)
Rearrange conditionals to reduce indentation and make it a bit easier to read
the logic. Also makes some error message updates for better consistency
with the recent decision around capitalising resource names and the
upcoming addition of config secrets.

Updates #cleanup

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-08 17:13:00 +01:00
dependabot[bot]
866714a894 .github: Bump github/codeql-action from 3.26.9 to 3.26.11 (#13710)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.9 to 3.26.11.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](461ef6c76d...6db8d6351f)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 22:15:40 -06:00
dependabot[bot]
266c14d6ca .github: Bump actions/cache from 4.0.2 to 4.1.0 (#13711)
Bumps [actions/cache](https://github.com/actions/cache) from 4.0.2 to 4.1.0.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](0c45773b62...2cdf405574)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 20:48:06 -06:00
Nick Hill
9a73462ea4 types/lazy: add DeferredInit type
It is sometimes necessary to defer initialization steps until the first actual usage
or until certain prerequisites have been met. For example, policy setting and
policy source registration should not occur during package initialization.
Instead, they should be deferred until the syspolicy package is actually used.
Additionally, any errors should be properly handled and reported, rather than
causing a panic within the package's init function.

In this PR, we add DeferredInit, to facilitate the registration and invocation
of deferred initialization functions.

Updates #12687

Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
2024-10-07 15:43:22 -05:00
Brad Fitzpatrick
f3de4e96a8 derp: fix omitted word in comment
Fix comment just added in 38f236c725.

Updates tailscale/corp#23668
Updates #cleanup

Change-Id: Icbe112e24fcccf8c61c759c631ad09f3e5480547
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-07 12:21:10 -07:00
Irbe Krumina
7f016baa87 cmd/k8s-operator,k8s-operator: create ConfigMap for egress services + small fixes for egress services (#13715)
cmd/k8s-operator, k8s-operator: create ConfigMap for egress services + small reconciler fixes

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-07 20:12:56 +01:00
Brad Fitzpatrick
38f236c725 derp: add server metric for batch write sizes
Updates tailscale/corp#23668

Change-Id: Ie6268c4035a3b29fd53c072c5793e4cbba93d031
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-07 11:22:51 -07:00
Erisa A
c588c36233 types/key: use tlpub: in error message (#13707)
Fixes tailscale/corp#19442

Signed-off-by: Erisa A <erisa@tailscale.com>
2024-10-07 17:28:45 +01:00
Brad Fitzpatrick
cb10eddc26 tool/gocross: fix argument order to find
To avoid warning:

    find: warning: you have specified the global option -maxdepth after the argument -type, but global options are not positional, i.e., -maxdepth affects tests specified before it as well as those specified after it.  Please specify global options before other arguments.

Fixes tailscale/corp#23689

Change-Id: I91ee260b295c552c0a029883d5e406733e081478
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-07 08:07:03 -07:00
Tom Proctor
e48cddfbb3 cmd/{containerboot,k8s-operator},k8s-operator,kube: add ProxyGroup controller (#13684)
Implements the controller for the new ProxyGroup CRD, designed for
running proxies in a high availability configuration. Each proxy gets
its own config and state Secret, and its own tailscale node ID.

We are currently mounting all of the config secrets into the container,
but will stop mounting them and instead read them directly from the kube
API once #13578 is implemented.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-07 14:58:45 +01:00
Brad Fitzpatrick
1005cbc1e4 tailscaleroot: panic if tailscale_go build tag but Go toolchain mismatch
Fixes #13527

Change-Id: I05921969a84a303b60d1b3b9227aff9865662831
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-06 15:22:04 -07:00
Brad Fitzpatrick
c48cc08de2 wgengine: stop conntrack log spam about Canonical net probes
Like we do for the ones on iOS.

As a bonus, this removes a caller of tsaddr.IsTailscaleIP which we
want to revamp/remove soonish.

Updates #13687

Change-Id: Iab576a0c48e9005c7844ab52a0aba5ba343b750e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-05 12:51:55 -07:00
Andrew Dunham
12f1bc7c77 envknob: support disk-based envknobs on the macsys build
Per my investigation just now, the $HOME environment variable is unset
on the macsys (standalone macOS GUI) variant, but the current working
directory is valid. Look for the environment variable file in that
location in addition to inside the home directory.

Updates #3707

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I481ae2e0d19b316244373e06865e3b5c3a9f3b88
2024-10-04 17:12:27 -04:00
Patrick O'Doherty
4ad3f01225 safeweb: allow passing http.Server in safeweb.Config (#13688)
Extend safeweb.Config with the ability to pass a http.Server that
safeweb will use to server traffic.

Updates corp#8207

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
2024-10-04 11:57:00 -07:00
kari-ts
8fdffb8da0 hostinfo: update SetPackage doc with new Android values (#13537)
Fixes tailscale/corp#23283

Signed-off-by: kari-ts <kari@tailscale.com>
2024-10-04 16:35:19 +00:00
Erisa A
f30d85310c cmd/tailscale/cli: don't print disablement secrets if init fails (#13673)
* cmd/tailscale/cli: don't print disablement secrets if init fails

Fixes tailscale/corp#11355

Signed-off-by: Erisa A <erisa@tailscale.com>

* cmd/tailscale/cli: changes from code review

Signed-off-by: Erisa A <erisa@tailscale.com>

* cmd/tailscale/cli: small grammar change

Signed-off-by: Erisa A <erisa@tailscale.com>

---------

Signed-off-by: Erisa A <erisa@tailscale.com>
2024-10-04 16:01:48 +01:00
Irbe Krumina
e8bb5d1be5 cmd/{k8s-operator,containerboot},k8s-operator,kube: reconcile ExternalName Services for ProxyGroup (#13635)
Adds a new reconciler that reconciles ExternalName Services that define a
tailnet target that should be exposed to cluster workloads on a ProxyGroup's
proxies.
The reconciler ensures that for each such service, the config mounted to
the proxies is updated with the tailnet target definition and that
and EndpointSlice and ClusterIP Service are created for the service.

Adds a new reconciler that ensures that as proxy Pods become ready to route
traffic to a tailnet target, the EndpointSlice for the target is updated
with the Pods' endpoints.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-04 13:11:35 +01:00
Irbe Krumina
9bd158cc09 cmd/containerboot,util/linuxfw: create a SNAT rule for dst/src only once, clean up if needed (#13658)
The AddSNATRuleForDst rule was adding a new rule each time it was called including:
- if a rule already existed
- if a rule matching the destination, but with different desired source already existed

This was causing issues especially for the in-progress egress HA proxies work,
where the rules are now refreshed more frequently, so more redundant rules
were being created.

This change:
- only creates the rule if it doesn't already exist
- if a rule for the same dst, but different source is found, delete it
- also ensures that egress proxies refresh firewall rules
if the node's tailnet IP changes

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-03 20:15:00 +01:00
Patrick O'Doherty
a3c6a3a34f safeweb: add StrictTransportSecurityOptions config (#13679)
Add the ability to specify Strict-Transport-Security options in response
to BrowserMux HTTP requests in safeweb.

Updates https://github.com/tailscale/corp/issues/23375

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
2024-10-03 18:38:29 +00:00
Brad Fitzpatrick
dc60c8d786 ssh/tailssh: pass window size pixels in IoctlSetWinsize events
Fixes #13669

Change-Id: Id44cfbb83183f1bbcbdc38c29238287b9d288707
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-03 09:24:28 -07:00
Andrea Gottardo
58c6bc2991 logpolicy: force TLS 1.3 handshake
Updates tailscale/tailscale#3363

We know `log.tailscale.io` supports TLS 1.3, so we can enforce its usage in the client to shake some bytes off the TLS handshake each time a connection is opened to upload logs.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-10-03 09:16:23 -07:00
Brad Fitzpatrick
5f88b65764 wgengine/netstack: check userspace ping success on Windows
Hacky temporary workaround until we do #13654 correctly.

Updates #13654

Change-Id: I764eaedbb112fb3a34dddb89572fec1b2543fd4a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-03 09:07:39 -07:00
Brad Fitzpatrick
1f8eea53a8 control/controlclient: include HTTP status string in error message too
Not just its code.

Updates tailscale/corp#23584

Change-Id: I8001a675372fe15da797adde22f04488d8683448
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-03 08:37:16 -07:00
Brad Fitzpatrick
6f694da912 wgengine/magicsock: avoid log spam from ReceiveFunc on shutdown
The new logging in 2dd71e64ac is spammy at shutdown:

    Receive func ReceiveIPv6 exiting with error: *net.OpError, read udp [::]:38869: raw-read udp6 [::]:38869: use of closed network connection
    Receive func ReceiveIPv4 exiting with error: *net.OpError, read udp 0.0.0.0:36123: raw-read udp4 0.0.0.0:36123: use of closed network connection

Skip it if we're in the process of shutting down.

Updates #10976

Change-Id: I4f6d1c68465557eb9ffe335d43d740e499ba9786
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-02 20:22:12 -07:00
Naman Sood
09ec2f39b5 tailcfg: add func to check for known valid ServiceProtos (#13668)
Updates tailscale/corp#23574.

Signed-off-by: Naman Sood <mail@nsood.in>
2024-10-02 22:54:02 -04:00
Brad Fitzpatrick
383120c534 ipn/ipnlocal: don't run portlist code unless service collection is on
We were selectively uploading it, but we were still gathering it,
which can be a waste of CPU.

Also remove a bunch of complexity that I don't think matters anymore.

And add an envknob to force service collection off on a single node,
even if the tailnet policy permits it.

Fixes #13463

Change-Id: Ib6abe9e29d92df4ffa955225289f045eeeb279cf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-02 18:08:31 -07:00
Nick Khyl
d837e0252f wf/firewall: allow link-local multicast for permitted local routes when the killswitch is on on Windows
When an Exit Node is used, we create a WFP rule to block all inbound and outbound traffic,
along with several rules to permit specific types of traffic. Notably, we allow all inbound and
outbound traffic to and from LocalRoutes specified in wgengine/router.Config. The list of allowed
routes always includes routes for internal interfaces, such as loopback and virtual Hyper-V/WSL2
interfaces, and may also include LAN routes if the "Allow local network access" option is enabled.
However, these permitting rules do not allow link-local multicast on the corresponding interfaces.
This results in broken mDNS/LLMNR, and potentially other similar issues, whenever an exit node is used.

In this PR, we update (*wf.Firewall).UpdatePermittedRoutes() to create rules allowing outbound and
inbound link-local multicast traffic to and from the permitted IP ranges, partially resolving the mDNS/LLMNR
and *.local name resolution issue.

Since Windows does not attempt to send mDNS/LLMNR queries if a catch-all NRPT rule is present,
it is still necessary to disable the creation of that rule using the disable-local-dns-override-via-nrpt nodeAttr.

Updates #13571

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-02 18:36:01 -05:00
Brad Fitzpatrick
b8af93310a tstest: add the start of a testing wishlist
Of tests we wish we could easily add. One day.

Updates #13038

Change-Id: If44646f8d477674bbf2c9a6e58c3cd8f94a4e8df
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-02 16:08:41 -07:00
Andrea Gottardo
6de6ab015f net/dns: tweak DoH timeout, limit MaxConnsPerHost, require TLS 1.3 (#13564)
Updates tailscale/tailscale#6148

This is the result of some observations we made today with @raggi. The DNS over HTTPS client currently doesn't cap the number of connections it uses, either in-use or idle. A burst of DNS queries will open multiple connections. Idle connections remain open for 30 seconds (this interval is defined in the dohTransportTimeout constant). For DoH providers like NextDNS which send keep-alives, this means the cellular modem will remain up more than expected to send ACKs if any keep-alives are received while a connection remains idle during those 30 seconds. We can set the IdleConnTimeout to 10 seconds to ensure an idle connection is terminated if no other DNS queries come in after 10 seconds. Additionally, we can cap the number of connections to 1. This ensures that at all times there is only one open DoH connection, either active or idle. If idle, it will be terminated within 10 seconds from the last query.

We also observed all the DoH providers we support are capable of TLS 1.3. We can force this TLS version to reduce the number of packets sent/received each time a TLS connection is established.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-10-02 09:26:11 -07:00
Brad Fitzpatrick
a01b545441 control/control{client,http}: don't noise dial localhost:443 in http-only tests
1eaad7d3de regressed some tests in another repo that were starting up
a control server on `http://127.0.0.1:nnn`. Because there was no https
running, and because of a bug in 1eaad7d3de (which ended up checking
the recently-dialed-control check twice in a single dial call), we
ended up forcing only the use of TLS dials in a test that only had
plaintext HTTP running.

Instead, plumb down support for explicitly disabling TLS fallbacks and
use it only when running in a test and using `http` scheme control
plane URLs to 127.0.0.1 or localhost.

This fixes the tests elsewhere.

Updates #13597

Change-Id: I97212ded21daf0bd510891a278078daec3eebaa6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-02 10:41:08 -05:00
Brad Fitzpatrick
6b03e18975 control/controlhttp: rename a param from addr to optAddr for clarity
And update docs.

Updates #cleanup
Updates #13597 (tangentially; noted this cleanup while debugging)

Change-Id: I62440294c78b0bb3f5673be10318dd89af1e1bfe
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-02 10:41:08 -05:00
Brad Fitzpatrick
f49d218cfe net/dnscache: don't fall back to an IPv6 dial if we don't have IPv6
I noticed while debugging a test failure elsewhere that our failure
logs (when verbosity is cranked up) were uselessly attributing dial
failures to failure to dial an invalid IP address (this IPv6 address
we didn't have), rather than showing me the actual IPv4 connection
failure.

Updates #13597 (tangentially)

Change-Id: I45ffbefbc7e25ebfb15768006413a705b941dae5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-02 10:41:08 -05:00
Brad Fitzpatrick
30f0fa95d9 control/controlclient: bound ReportHealthChange context lifetime to Direct client's
Fixes #13651

Change-Id: I8154d3cc0ca40fe7a0223b26ae2e77e8d6ba874b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-02 10:40:39 -05:00
Andrea Gottardo
ed1ac799c8 net/captivedetection: set Timeout on net.Dialer (#13613)
Updates tailscale/tailscale#1634
Updates tailscale/tailscale#13265

Captive portal detection uses a custom `net.Dialer` in its `http.Client`. This custom Dialer ensures that the socket is bound specifically to the Wi-Fi interface. This is crucial because without it, if any default routes are set, the outgoing requests for detecting a captive portal would bypass Wi-Fi and go through the default route instead.

The Dialer did not have a Timeout property configured, so the default system timeout was applied. This caused issues in #13265, where we attempted to make captive portal detection requests over an IPsec interface used for Wi-Fi Calling. The call to `connect()` would fail and remain blocked until the system timeout (approximately 1 minute) was reached.

In #13598, I simply excluded the IPsec interface from captive portal detection. This was a quick and safe mitigation for the issue. This PR is a follow-up to make the process more robust, by setting a 3 seconds timeout on any connection establishment on any interface (this is the same timeout interval we were already setting on the HTTP client).

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-10-02 15:29:46 +00:00
Nick Khyl
e66fe1f2e8 docs/windows/policy: add ADMX policy setting to configure the AuthKey
Updates tailscale/corp#22120

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-02 09:19:19 -05:00
dependabot[bot]
992ee6dd0b .github: Bump github/codeql-action from 3.26.8 to 3.26.9 (#13625)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.8 to 3.26.9.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](294a9d9291...461ef6c76d)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-01 23:27:30 -06:00
Brad Fitzpatrick
262c526c4e net/portmapper: don't treat 0.0.0.0 as a valid IP
Updates tailscale/corp#23538

Change-Id: I58b8c30abe43f1d1829f01eb9fb2c1e6e8db9476
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-01 16:11:47 -05:00
Andrew Dunham
16ef88754d net/portmapper: don't return unspecified/local external IPs
We were previously not checking that the external IP that we got back
from a UPnP portmap was a valid endpoint; add minimal validation that
this endpoint is something that is routeable by another host.

Updates tailscale/corp#23538

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id9649e7683394aced326d5348f4caa24d0efd532
2024-10-01 14:13:40 -04:00
Brad Fitzpatrick
1eaad7d3de control/controlhttp: fix connectivity on Alaska Air wifi
Updates #13597

Change-Id: Ifbf52b93fd35d64fcf80f8fddbfd610008fd8742
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-01 11:58:20 -05:00
Brad Fitzpatrick
fd32f0ddf4 control/controlhttp: factor out some code in prep for future change
This pulls out the clock and forceNoise443 code into methods on the
Dialer as cleanup in its own commit to make a future change less
distracting.

Updates #13597

Change-Id: I7001e57fe7b508605930c5b141a061b6fb908733
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-01 11:28:59 -05:00
Brad Fitzpatrick
d3f302d8e2 cmd/tailscale/cli: make 'tailscale debug ts2021' try twice
In prep for a future port 80 MITM fix, make the 'debug ts2021' command
retry once after a failure to give it a chance to pick a new strategy.

Updates #13597

Change-Id: Icb7bad60cbf0dbec78097df4a00e9795757bc8e4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-01 11:28:59 -05:00
Mario Minardi
8f44ba1cd6 ssh: Add logic to set accepted environment variables in SSH session (#13559)
Add logic to set environment variables that match the SSH rule's
`acceptEnv` settings in the SSH session's environment.

Updates https://github.com/tailscale/corp/issues/22775

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-30 21:47:45 -06:00
dependabot[bot]
dd6b808acf .github: Bump peter-evans/create-pull-request from 7.0.1 to 7.0.5 (#13626)
Bumps [peter-evans/create-pull-request](https://github.com/peter-evans/create-pull-request) from 7.0.1 to 7.0.5.
- [Release notes](https://github.com/peter-evans/create-pull-request/releases)
- [Commits](8867c4aba1...5e914681df)

---
updated-dependencies:
- dependency-name: peter-evans/create-pull-request
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-30 21:12:44 -06:00
Anton Tolchanov
a70287d324 logpolicy: don't create a filch buffer if logging is disabled
Updates #9549

Signed-off-by: Anton Tolchanov <commits@knyar.net>
2024-09-30 11:36:08 +02:00
Maisem Ali
fb0f8fc0ae cmd/tsidp: add --dir flag
To better control where the tsnet state is being stored.

Updates #10263

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-09-29 16:15:22 -07:00
Irbe Krumina
096b090caf cmd/containerboot,kube,util/linuxfw: configure kube egress proxies to route to 1+ tailnet targets (#13531)
* cmd/containerboot,kube,util/linuxfw: configure kube egress proxies to route to 1+ tailnet targets

This commit is first part of the work to allow running multiple
replicas of the Kubernetes operator egress proxies per tailnet service +
to allow exposing multiple tailnet services via each proxy replica.

This expands the existing iptables/nftables-based proxy configuration
mechanism.

A proxy can now be configured to route to one or more tailnet targets
via a (mounted) config file that, for each tailnet target, specifies:
- the target's tailnet IP or FQDN
- mappings of container ports to which cluster workloads will send traffic to
tailnet target ports where the traffic should be forwarded.

Example configfile contents:
{
  "some-svc": {"tailnetTarget":{"fqdn":"foo.tailnetxyz.ts.net","ports"{"tcp:4006:80":{"protocol":"tcp","matchPort":4006,"targetPort":80},"tcp:4007:443":{"protocol":"tcp","matchPort":4007,"targetPort":443}}}}
}

A proxy that is configured with this config file will configure firewall rules
to route cluster traffic to the tailnet targets. It will then watch the config file
for updates as well as monitor relevant netmap updates and reconfigure firewall
as needed.

This adds a bunch of new iptables/nftables functionality to make it easier to dynamically update
the firewall rules without needing to restart the proxy Pod as well as to make
it easier to debug/understand the rules:

- for iptables, each portmapping is a DNAT rule with a comment pointing
at the 'service',i.e:

-A PREROUTING ! -i tailscale0 -p tcp -m tcp --dport 4006 -m comment --comment "some-svc:tcp:4006 -> tcp:80" -j DNAT --to-destination 100.64.1.18:80
Additionally there is a SNAT rule for each tailnet target, to mask the source address.

- for nftables, a separate prerouting chain is created for each tailnet target
and all the portmapping rules are placed in that chain. This makes it easier
to look up rules and delete services when no longer needed.
(nftables allows hooking a custom chain to a prerouting hook, so no extra work
is needed to ensure that the rules in the service chains are evaluated).

The next steps will be to get the Kubernetes Operator to generate
the configfile and ensure it is mounted to the relevant proxy nodes.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-09-29 16:30:53 +01:00
Irbe Krumina
c62b0732d2 cmd/k8s-operator: remove auth key once proxy has logged in (#13612)
The operator creates a non-reusable auth key for each of
the cluster proxies that it creates and puts in the tailscaled
configfile mounted to the proxies.
The proxies are always tagged, and their state is persisted
in a Kubernetes Secret, so their node keys are expected to never
be regenerated, so that they don't need to re-auth.

Some tailnet configurations however have seen issues where the auth
keys being left in the tailscaled configfile cause the proxies
to end up in unauthorized state after a restart at a later point
in time.
Currently, we have not found a way to reproduce this issue,
however this commit removes the auth key from the config once
the proxy can be assumed to have logged in.

If an existing, logged-in proxy is upgraded to this version,
its redundant auth key will be removed from the conffile.

If an existing, logged-in proxy is downgraded from this version
to a previous version, it will work as before without re-issuing key
as the previous code did not enforce that a key must be present.

Updates tailscale/tailscale#13451

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-09-27 17:47:27 +01:00
Kristoffer Dalby
77832553e5 ipn/ipnlocal: add advertised and primary route metrics
Updates tailscale/corp#22075

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-27 16:05:14 +02:00
Tom Proctor
cab2e6ea67 cmd/k8s-operator,k8s-operator: add ProxyGroup CRD (#13591)
The ProxyGroup CRD specifies a set of N pods which will each be a
tailnet device, and will have M different ingress or egress services
mapped onto them. It is the mechanism for specifying how highly
available proxies need to be. This commit only adds the definition, no
controller loop, and so it is not currently functional.

This commit also splits out TailnetDevice and RecorderTailnetDevice
into separate structs because the URL field is specific to recorders,
but we want a more generic struct for use in the ProxyGroup status field.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-09-27 01:05:56 +01:00
Andrew Dunham
7ec8bdf8b1 go.mod: upgrade golangci-lint
To pull in the fix for mgechev/revive#863 - seen in the GitHub Actions
check below:
    https://github.com/tailscale/tailscale/actions/runs/11057524933/job/30721507353?pr=13600

Updates #13602

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ia04adc5d74bdbde14204645ca948794447b16776
2024-09-26 17:08:54 -04:00
Andrea Gottardo
69be54c7b6 net/captivedetection: exclude ipsec interfaces from captive portal detection (#13598)
Updates tailscale/tailscale#1634

Logs from some iOS users indicate that we're pointlessly performing captive portal detection on certain interfaces named ipsec*. These are tunnels with the cellular carrier that do not offer Internet access, and are only used to provide internet calling functionality (VoLTE / VoWiFi).

```
attempting to do captive portal detection on interface ipsec1
attempting to do captive portal detection on interface ipsec6
```

This PR excludes interfaces with the `ipsec` prefix from captive portal detection.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-09-26 17:28:10 +00:00
Kristoffer Dalby
5550a17391 wgengine: make opts.Metrics mandatory
Fixes #13582

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-26 13:09:47 +02:00
Kristoffer Dalby
7d1160ddaa {ipn,net,tsnet}: use tsaddr helpers
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-26 12:17:31 +02:00
Kristoffer Dalby
f03e82a97c client/web: use tsaddr helpers
Updates #cleanup

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-26 12:17:31 +02:00
Kristoffer Dalby
0909431660 cmd/tailscale: use tsaddr helpers
Updates #cleanup

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-26 12:17:31 +02:00
Kristoffer Dalby
3dc33a0a5b net/tsaddr: add WithoutExitRoutes and IsExitRoute
Updates #cleanup

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-26 12:17:31 +02:00
Mario Minardi
c90c9938c8 ssh/tailssh: add logic for matching against AcceptEnv patterns (#13466)
Add logic for parsing and matching against our planned format for
AcceptEnv values. Namely, this supports direct matches against string
values and matching where * and ? are treated as wildcard characters
which match against an arbitrary number of characters and a single
character respectively.

Actually using this logic in non-test code will come in subsequent
changes.

Updates https://github.com/tailscale/corp/issues/22775

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-25 21:09:05 -06:00
James Tucker
9eb59c72c1 wgengine/magicsock: fix check for EPERM on macOS
Like Linux, macOS will reply to sendto(2) with EPERM if the firewall is
currently blocking writes, though this behavior is like Linux
undocumented. This is often caused by a faulting network extension or
content filter from EDR software.

Updates #11710
Updates #12891
Updates #13511

Signed-off-by: James Tucker <james@tailscale.com>
2024-09-25 16:33:36 -07:00
Andrew Dunham
717d589149 metrics: revert changes to MultiLabelMap's String method
This breaks its ability to be used as an expvar and is blocking a trunkd
deploy. Revert for now, and add a test to ensure that we don't break it
in a future change.

Updates #13550

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1f1221c257c1de47b4bff0597c12f8530736116d
2024-09-25 19:20:50 -04:00
Cameron Stokes
65c26357b1 cmd/k8s-operator, k8s-operator: fix outdated kb links (#13585)
updates #13583

Signed-off-by: Cameron Stokes <cameron@tailscale.com>
2024-09-25 22:15:42 +01:00
Adrian Dewhurst
2fdbcbdf86 wgengine/magicsock: only used cached results for GetLastNetcheckReport
When querying for an exit node suggestion, occasionally it triggers a
new report concurrently with an existing report in progress. Generally,
there should always be a recent report or one in progress, so it is
redundant to start one there, and it causes concurrency issues.

Fixes #12643

Change-Id: I66ab9003972f673e5d4416f40eccd7c6676272a5
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
2024-09-25 16:50:33 -04:00
Brad Fitzpatrick
c2f0c705e7 health: clean up updateBuiltinWarnablesLocked a bit, fix DERP warnings
Updates #13265

Change-Id: Iabe4a062204a7859d869f6acfb9274437b4ea1ea
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-25 12:52:02 -07:00
Kristoffer Dalby
0e0e53d3b3 util/usermetrics: make usermetrics non-global
this commit changes usermetrics to be non-global, this is a building
block for correct metrics if a go process runs multiple tsnets or
in tests.

Updates #13420
Updates tailscale/corp#22075

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-25 15:57:00 +02:00
Brad Fitzpatrick
e1bbe1bf45 derp: document the RunWatchConnectionLoop callback gotchas
Updates #13566

Change-Id: I497b5adc57f8b1b97dbc3f74c0dc67140caad436
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-24 15:32:08 -07:00
Brad Fitzpatrick
6f7e7a30e3 tool/gocross: make gocross-wrapper.sh keep multiple Go toolchains around
So it doesn't delete and re-pull when switching between branches.

Updates tailscale/corp#17686

Change-Id: Iffb989781db42fcd673c5f03dbd0ce95972ede0f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-24 14:17:45 -07:00
Mario Minardi
43f4131d7a {release,version}: add DSM7.2 specific synology builds (#13405)
Add separate builds for DSM7.2 for synology so that we can encode
separate versioning information in the INFO file to distinguish between
the two.

Fixes https://github.com/tailscale/corp/issues/22908

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-24 15:00:37 -06:00
Andrea Gottardo
8a6f48b455 cli: add tailscale dns query (#13368)
Updates tailscale/tailscale#13326

Adds a CLI subcommand to perform DNS queries using the internal DNS forwarder and observe its internals (namely, which upstream resolvers are being used).

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-09-24 20:18:45 +00:00
dependabot[bot]
a98f75b783 .github: Bump tibdex/github-app-token from 1.8.0 to 2.1.0 (#9529)
Bumps [tibdex/github-app-token](https://github.com/tibdex/github-app-token) from 1.8.0 to 2.1.0.
- [Release notes](https://github.com/tibdex/github-app-token/releases)
- [Commits](b62528385c...3beb63f4bd)

---
updated-dependencies:
- dependency-name: tibdex/github-app-token
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Mario Minardi <mario@tailscale.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-24 10:01:00 -06:00
Mario Minardi
05d82fb0d8 .github: pin re-actors/alls-green to latest 1.x (#13558)
Pin re-actors/alls-green usage to latest 1.x. This was previously
pointing to `@release/v2` which pulls in the latest changes from this
branch as they are released, with the potential to break our workflows
if a breaking change or malicious version on this stream is ever pushed.

Changing this to a pinned version also means that dependabot will keep
this in the pinned version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-23 17:35:53 -06:00
Mario Minardi
04bbef0e8b .github: update and pin actions/upload-artifact to latest 4.x (#13556)
Update and pin actions/upload-artifact usage to latest 4.x. These were
previously pointing to @3 which pulls in the latest v3 as they are
released, with the potential to break our workflows if a breaking change
or malicious version on the @3 stream is ever pushed.

Changing this to a pinned version also means that dependabot will keep
this in the pinned version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-23 16:44:26 -06:00
Mario Minardi
a8bd0cb9c2 .github: update and pin actions/cache to latest 4.x (#13555)
Update and pin actions/cache usage to latest 4.x. These were previously
pointing to `@3` which pulls in the latest v3 as they are released, with
the potential to break our workflows if a breaking change or malicious
version on the `@3` stream is ever pushed.

Changing this to a pinned version also means that dependabot will keep
this in the pinned version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.

The breaking change between v3 and v4 is that v4 requires Node 20 which
should be a non-issue where this is run.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-23 16:34:55 -06:00
Mario Minardi
a3f7e72321 .github: use and pin slackapi/slack-github-action to latest 1.x (#13554)
Use slackapi/slack-github-action across the board and pin to latest 1.x.
Previously we were referencing the 1.27.0 tag directly which is
vulnerable to someone replacing that version tag with malicious code.

Replace usage of ruby/action-slack with slackapi/slack-github-action as
the latter is the officially supported action from slack.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-23 16:11:13 -06:00
Mario Minardi
22e98cf95e .github: pin codeql actions to latest 3.x (#13552)
Pin codeql actions usage to latest 3.x. These were previously pointing
to `@2` which pulls in the latest v2 as they are released, with the
potential to break our workflows if a breaking change or malicious
version on the `@2` stream is ever pushed.

Changing this to a pinned version also means that dependabot will keep
this in the pinend version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.

The breaking change between v2 and v3 is that v3 requires Node 20 which
is a non-issue as we are running this on ubuntu latest.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-23 15:52:26 -06:00
Mario Minardi
2c1bbfb902 .github: pin actions/setup-go usage to latest 5.x (#13553)
Pin actions/checkout usage to latest 5.x. These were previously pointing
to `@4` which pulls in the latest v4 as they are released, with the
potential to break our workflows if a breaking change or malicious
version on the `@4` stream is ever pushed.

Changing this to a pinned version also means that dependabot will keep
this in the pinend version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.

The breaking change between v4 and v5 is that v5 requires Node 20 which
should be a non-issue where it is used.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-23 15:14:49 -06:00
Mario Minardi
07991dec83 .github: pin actions/checkout to latest v3 or v4 as appropriate (#13551)
Pin actions/checkout usage to latest 3.x or 4.x as appropriate. These
were previously pointing to `@4` or `@3` which pull in the latest
versions at these tags as they are released, with the potential to break
our workflows if a breaking change or malicious version for either of
these streams are released.

Changing this to a pinned version also means that dependabot will keep
this in the pinend version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-23 14:52:19 -06:00
Mario Minardi
8d508712c9 tailcfg: add AcceptEnv field to SSHRule (#13523)
Add an `AcceptEnv` field to `SSHRule`. This will contain the collection
of environment variable names / patterns that are specified in the
`acceptEnv` block for the SSH rule within the policy file. This will be
used in the tailscale client to filter out unacceptable environment
variables.

Updates: https://github.com/tailscale/corp/issues/22775

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-22 20:15:26 -06:00
Joe Tsai
dc86d3589c types/views: add SliceView.All iterator (#13536)
And convert a all relevant usages.

Updates #12912

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-09-20 13:55:33 -07:00
Brad Fitzpatrick
3e9ca6c64b go.toolchain.rev: bump oss, test toolchain matches go.toolchain.rev
Update go.toolchain.rev for https://github.com/tailscale/go/pull/104 and
add a test that, when using the tailscale_go build tag, we use the
right Go toolchain.

We'll crank up the strictness in later commits.

Updates #13527

Change-Id: Ifb09a844858be2beb144a420e4e9dbdc5c03ae3a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-19 20:27:59 -07:00
Tom Proctor
d0a56a8870 cmd/containerboot: split main.go (#13517)
containerboot's main.go had grown to well over 1000 lines with
lots of disparate bits of functionality. This commit is pure copy-
paste to group related functionality outside of the main function
into its own set of files. Everything is still in the main package
to keep the diff incremental and reviewable.

Updates #cleanup

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-09-19 17:58:26 +01:00
James Tucker
af5a845a87 net/dns/resolver: fix dns-sd NXDOMAIN responses from quad-100
mdnsResponder at least as of macOS Sequoia does not find NXDOMAIN
responses to these dns-sd PTR queries acceptable unless they include the
question section in the response. This was found debugging #13511, once
we turned on additional diagnostic reporting from mdnsResponder we
witnessed:

```
Received unacceptable 12-byte response from 100.100.100.100 over UDP via utun6/27 -- id: 0x7F41 (32577), flags: 0x8183 (R/Query, RD, RA, NXDomain), counts: 0/0/0/0,
```

If the response includes a question section, the resposnes are
acceptable, e.g.:

```
Received acceptable 59-byte response from 8.8.8.8 over UDP via en0/17 -- id: 0x2E55 (11861), flags: 0x8183 (R/Query, RD, RA, NXDomain), counts: 1/0/0/0,
```

This may be contributing to an issue under diagnosis in #13511 wherein
some combination of conditions results in mdnsResponder no longer
answering DNS queries correctly to applications on the system for
extended periods of time (multiple minutes), while dig against quad-100
provides correct responses for those same domains. If additional debug
logging is enabled in mdnsResponder we see it reporting:

```
Penalizing server 100.100.100.100 for 60 seconds
```

It is also possible that the reason that macOS & iOS never "stopped
spamming" these queries is that they have never been replied to with
acceptable responses. It is not clear if this special case handling of
dns-sd PTR queries was ever beneficial, and given this evidence may have
always been harmful. If we subsequently observe that the queries settle
down now that they have acceptable responses, we should remove these
special cases - making upstream queries very occasionally isn't a lot of
battery, so we should be better off having to maintain less special
cases and avoid bugs of this class.

Updates #2442
Updates #3025
Updates #3363
Updates #3594
Updates #13511

Signed-off-by: James Tucker <james@tailscale.com>
2024-09-18 18:43:03 -07:00
Andrea Gottardo
3a467b66b6 go/toolchain: use ed9dc37b2b000f376a3e819cbb159e2c17a2dac6 (#13507)
Updates tailscale/tailscale#13452

Bump the Go toolchain to the latest to pick up changes required to not crash on Android 9/10.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-09-18 18:51:09 +00:00
M. J. Fromberger
5f89c93274 safeweb: add a ListenAndServe method to the Server type (#13498)
Updates #13497

Change-Id: I398e9fa58ad0b9dc799ea280c9c7a32150150ee4
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2024-09-17 12:59:28 -07:00
Jordan Whited
951884b077 net/netcheck,wgengine/magicsock: plumb OnlyTCP443 controlknob through netcheck (#13491)
Updates tailscale/corp#17879

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-09-17 12:24:42 -07:00
Fran Bull
8b962f23d1 cmd/natc: fix nil pointer
Fixes #13495

Signed-off-by: Fran Bull <fran@tailscale.com>
2024-09-17 09:48:48 -07:00
Jordan Whited
5f4a4c6744 wgengine/magicsock: fix sendUDPStd docs (#13490)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-09-16 19:28:00 -07:00
Jordan Whited
4084c6186d wgengine/magicsock: add side-effect-free function for netcheck UDP sends (#13487)
Updates #13484
Updates tailscale/corp#17879

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-09-16 19:00:12 -07:00
Brad Fitzpatrick
8012bb4216 derp: refactor DERP server's peer-gone watch mechanism
In prep for upcoming flow tracking & mutex contention optimization
changes, this change refactors (subjectively simplifying) how the DERP
Server accounts for which peers have written to which other peers, to
be able to send PeerGoneReasonDisconnected messages to writes to
uncache their DRPO (DERP Return Path Optimization) routes.

Notably, this removes the Server.sentTo field which was guarded by
Server.mu and checked on all packet sends. Instead, the accounting is
moved to each sclient's sendLoop goroutine and now only needs to
acquire Server.mu for newly seen senders, the first time a peer sends
a packet to that sclient.

This change reduces the number of reasons to acquire Server.mu
per-packet from two to one. Removing the last one is the subject of an
upcoming change.

Updates #3560
Updates #150

Change-Id: Id226216d6629d61254b6bfd532887534ac38586c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-16 17:47:38 -07:00
License Updater
7f1c193a83 licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2024-09-16 15:21:37 -07:00
Andrew Dunham
f572286bf9 gokrazy, various: use point versions of Go and update Nix deps
This un-breaks vim-go (which doesn't understand "go 1.23") and allows
the natlab tests to work in a Nix shell (by adding the "qemu-img" and
"mkfs.ext4" binaries to the shell). These binaries are available even on
macOS, as I'm testing on my M1 Max.

Updates #13038

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I99f8521b5de93ea47dc33b099d5b243ffc1303da
2024-09-16 16:06:43 -04:00
Andrew Dunham
40833a7524 wgengine/magicsock: disable raw disco by default; add envknob to enable
Updates #13140

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ica85b2ac8ac7eab4ec5413b212f004aecc453279
2024-09-16 11:06:33 -07:00
Mario Minardi
124ff3b034 {api.md,publicapi}: remove old API docs (#13468)
Now that we have our API docs hosted at https://tailscale.com/api we can
remove the previous (and now outdated) markdown based docs. The top
level api.md has been left with the only content being the redirect to
the new docs.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-13 14:10:33 -06:00
Jordan Whited
afec2d41b4 wgengine/magicsock: remove redundant deadline from netcheck report call (#13395)
netcheck.Client.GetReport() applies its own deadlines. This 2s deadline
was causing GetReport() to never fall back to HTTPS/ICMP measurements
as it was shorter than netcheck.stunProbeTimeout, leaving no time
for fallbacks.

Updates #13394
Updates #6187

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-09-13 10:51:30 -07:00
Mario Minardi
93f61aa4cc tailcfg: add node attr for SSH environment variables (#13450)
Add a node attr for enabling SSH environment variable handling logic.

Updates https://github.com/tailscale/corp/issues/22775

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-12 16:18:14 -06:00
Brad Fitzpatrick
aa15a63651 derp: add new concurrent server benchmark
In prep for reducing mutex contention on Server.mu.

Updates #3560

Change-Id: Ie95e7c6dc9f4b64b6f79b3b2338f8cd86c688d98
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-12 14:05:38 -07:00
kari-ts
3bee38d50f VERSION.txt: this is v1.75.0 (#13454)
Signed-off-by: kari-ts <kari@tailscale.com>
2024-09-12 20:19:46 +00:00
Brad Fitzpatrick
cec779e771 util/slicesx: add FirstElementEqual and LastElementEqual
And update a few callers as examples of motivation. (there are a
couple others, but these are the ones where it's prettier)

Updates #cleanup

Change-Id: Ic8c5cb7af0a59c6e790a599136b591ebe16d38eb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-11 18:36:00 -07:00
Brad Fitzpatrick
910462a8e0 derp: unify server's clientSet interface into concrete type
73280595a8 for #2751 added a "clientSet" interface to
distinguish the two cases of a client being singly connected (the
common case) vs tolerating multiple connections from the client at
once. At the time (three years ago) it was kinda an experiment
and we didn't know whether it'd stop the reconnect floods we saw
from certain clients. It did.

So this promotes it to a be first-class thing a bit, removing the
interface. The old tests from 73280595a were invaluable in ensuring
correctness while writing this change (they failed a bunch).

But the real motivation for this change is that it'll permit a future
optimization to add flow tracking for stats & performance where we
don't contend on Server.mu for each packet sent via DERP. Instead,
each client can track its active flows and hold on to a *clientSet and
ask the clientSet per packet what the active client is via one atomic
load rather than a mutex. And if the atomic load returns nil, we'll
know we need to ask the server to see if they died and reconnected and
got a new clientSet. But that's all coming later.

Updates #3560

Change-Id: I9ccda3e5381226563b5ec171ceeacf5c210e1faf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-11 16:17:27 -07:00
Maisem Ali
f2713b663e .github: enable fuzz testing again (go1.23)
Updates #12912

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-09-11 14:50:13 -07:00
Maisem Ali
4d6a8224d5 util/linuxfw: fall back to nftables when iptables not found
When the desired netfilter mode was unset, we would always try
to use the `iptables` binary. In such cases if iptables was not found,
tailscaled would just crash as seen in #13440. To work around this, in those
cases check if the `iptables` binary even exists and if it doesn't fall back
to the nftables implementation.

Verified that it works on stock Ubuntu 24.04.

Updates #5621
Updates #8555
Updates #8762
Fixes #13440

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-09-11 14:36:17 -07:00
Tom Proctor
98f4dd9857 cmd/k8s-operator,k8s-operator,kube: Add TSRecorder CRD + controller (#13299)
cmd/k8s-operator,k8s-operator,kube: Add TSRecorder CRD + controller

Deploys tsrecorder images to the operator's cluster. S3 storage is
configured via environment variables from a k8s Secret. Currently
only supports a single tsrecorder replica, but I've tried to take early
steps towards supporting multiple replicas by e.g. having a separate
secret for auth and state storage.

Example CR:

```yaml
apiVersion: tailscale.com/v1alpha1
kind: Recorder
metadata:
  name: rec
spec:
  enableUI: true
```

Updates #13298

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-09-11 12:19:29 +01:00
Brad Fitzpatrick
9f9470fc10 ipnlocal,proxymap,wgengine/netstack: add optional WhoIs/proxymap debug
Updates tailscale/corp#20600

Change-Id: I2bb17af0f40603ada1ba4cecc087443e00f9392a
Co-authored-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-10 14:23:33 -07:00
Fran Bull
7d16af8d95 cmd/natc: fix nil pointer
Fixes #13432

Signed-off-by: Fran Bull <fran@tailscale.com>
2024-09-10 13:49:29 -07:00
dependabot[bot]
436a0784a2 build(deps): bump ws from 8.14.2 to 8.17.1 in /client/web (#12524)
Bumps [ws](https://github.com/websockets/ws) from 8.14.2 to 8.17.1.
- [Release notes](https://github.com/websockets/ws/releases)
- [Commits](https://github.com/websockets/ws/compare/8.14.2...8.17.1)

---
updated-dependencies:
- dependency-name: ws
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-10 12:39:40 -06:00
dependabot[bot]
71b550c73c .github: Bump peter-evans/create-pull-request from 5.0.1 to 7.0.1 (#13419)
Bumps [peter-evans/create-pull-request](https://github.com/peter-evans/create-pull-request) from 5.0.1 to 7.0.1.
- [Release notes](https://github.com/peter-evans/create-pull-request/releases)
- [Commits](284f54f989...8867c4aba1)

---
updated-dependencies:
- dependency-name: peter-evans/create-pull-request
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-10 10:08:14 -06:00
Jordan Whited
a228d77f86 cmd/stunstamp: add protocol context to timeout logs (#13422)
We started out with a single protocol & port, now it's many.

Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-09-09 18:42:13 -07:00
Andrew Dunham
0970615b1b ipn/ipnlocal: don't program system DNS when node key is expired (#13370)
This mimics having Tailscale in the 'Stopped' state by programming an
empty DNS configuration when the current node key is expired.

Updates tailscale/support-escalations#55


Change-Id: I68ff4665761fb621ed57ebf879263c2f4b911610

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2024-09-09 15:15:29 -04:00
Brad Fitzpatrick
0a2e5afb26 tsnet: remove old package doc experimental warning
It was scaring people. It's been pretty stable for quite some time now
and we're unlikely to change the API and break people at this point.
We might, but have been trying not to.

Fixes tailscale/corp#22933

Change-Id: I0c3c79b57ccac979693c62ba320643a940ac947e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-09 09:40:43 -07:00
Irbe Krumina
209567e7a0 kube,cmd/{k8s-operator,containerboot},envknob,ipn/store/kubestore,*/depaware.txt: rename packages (#13418)
Rename kube/{types,client,api} -> kube/{kubetypes,kubeclient,kubeapi}
so that we don't need to rename the package on each import to
convey that it's kubernetes specific.

Updates#cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-09-08 20:57:29 +01:00
Irbe Krumina
d6dfb7f242 kube,cmd/{k8s-operator,containerboot},envknob,ipn/store/kubestore,*/depaware.txt: split out kube types (#13417)
Further split kube package into kube/{client,api,types}. This is so that
consumers who only need constants/static types don't have to import
the client and api bits.

Updates#cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-09-08 19:06:07 +01:00
Irbe Krumina
ecd64f6ed9 cmd/k8s-operator,kube: set app name for Kubernetes Operator proxies (#13410)
Updates tailscale/corp#22920

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-09-08 05:48:38 +01:00
Nick Khyl
4dfde7bffc net/dns: disable DNS registration for Tailscale interface on Windows
We already disable dynamic updates by setting DisableDynamicUpdate to 1 for the Tailscale interface.
However, this does not prevent non-dynamic DNS registration from happening when `ipconfig /registerdns`
runs and in similar scenarios. Notably, dns/windowsManager.SetDNS runs `ipconfig /registerdns`,
triggering DNS registration for all interfaces that do not explicitly disable it.

In this PR, we update dns/windowsManager.disableDynamicUpdates to also set RegistrationEnabled to 0.

Fixes #13411

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-09-07 19:00:38 +01:00
Irbe Krumina
2b0d0ddf5d sessionrecording,ssh/tailssh,k8s-operator: log connected recorder address (#13382)
Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-09-07 06:11:33 +01:00
Patrick O'Doherty
7ce9c1944a go.toolchain.rev: update to 1.23.1 (#13408)
Update Go toolchain to 1.23.1.

Updates #cleanup

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
2024-09-06 13:09:15 -07:00
Brad Fitzpatrick
71ff3d7c39 go.mod: bump github.com/illarion/gonotify/v2
Updates #13359

Change-Id: I28e048bf9d1d114d07d140f165f4ea89a82be79f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-06 08:36:10 -07:00
Jordan Whited
95f0094310 cmd/stunstamp: cleanup timeout and interval constants (#13393)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-09-05 13:40:12 -07:00
Nick Khyl
e7b5e8c8cd ipn/ipnserver: remove IdleTimeout
We no longer need this on Windows, and it was never required on other platforms.
It just results in more short-lived connections unless we use HTTP/2.

Updates tailscale/corp#18342

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-09-05 13:00:38 -05:00
Andrew Lytvynov
e7a6e7930f cmd/systray: handle reconnects to IPN bus (#13386)
When tailscaled restarts and our watch connection goes down, we get
stuck in an infinite loop printing `ipnbus error: EOF` (which ended up
consuming all the disk space on my laptop via the log file). Instead,
handle errors in `watchIPNBus` and reconnect after a short delay.

Updates #1708

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-09-05 10:11:05 -07:00
Flakes Updater
4f2a2bfa42 go.mod.sri: update SRI hash for go.mod changes
Signed-off-by: Flakes Updater <noreply+flakes-updater@tailscale.com>
2024-09-05 10:06:02 -07:00
Jordan Whited
7aa766ee65 net/tstun: probe TCP GRO (#13376)
Disable TCP & UDP GRO if the probe fails.

torvalds/linux@e269d79c7d broke virtio_net
TCP & UDP GRO causing GRO writes to return EINVAL. The bug was then
resolved later in
torvalds/linux@89add40066. The offending
commit was pulled into various LTS releases.

Updates #13041

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-09-05 09:59:31 -07:00
Andrew Dunham
7dcf65a10a net/dns: fix IsZero and Equal methods on OSConfig
Discovered this while investigating the following issue; I think it's
unrelated, but might as well fix it. Also, add a test helper for
checking things that have an IsZero method using the reflect package.

Updates tailscale/support-escalations#55

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I57b7adde43bcef9483763b561da173b4c35f49e2
2024-09-05 00:05:36 -04:00
Brad Fitzpatrick
13dee9db7b health: fix magicsockReceiveFuncWarnable health clearing
Fixes #13204

Change-Id: I7154cdabc9dc362dcc3221fd5a86e21f610bbff0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-04 17:08:33 -07:00
Brad Fitzpatrick
3d401c11fa all: use new Go 1.23 slices.Sorted more
Updates #12912

Change-Id: If1294e5bc7b5d3cf0067535ae10db75e8b988d8b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-04 14:52:21 -07:00
Anton Tolchanov
fd6686d81a tka: truncate long rotation signature chains
When a rotation signature chain reaches a certain size, remove the
oldest rotation signature from the chain before wrapping it in a new
rotation signature.

Since all previous rotation signatures are signed by the same wrapping
pubkey (node's own tailnet lock key), the node can re-construct the
chain, re-signing previous rotation signatures. This will satisfy the
existing certificate validation logic.

Updates #13185

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-09-04 22:17:21 +01:00
Brad Fitzpatrick
bcc47d91ca cmd/tailscale/cli: use new Go 1.23 slices.Sorted
And a grammatical nit.

Updates #12912

Change-Id: I9feae53beb4d28dfe98b583373e2e0a43c801fc4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-04 13:27:05 -07:00
Nick Khyl
11d205f6c4 control/controlclient,posture,util/syspolicy: use predefined syspolicy keys instead of string literals
With the upcoming syspolicy changes, it's imperative that all syspolicy keys are defined in the syspolicy package
for proper registration. Otherwise, the corresponding policy settings will not be read.

This updates a couple of places where we still use string literals rather than syspolicy consts.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-09-04 15:25:19 -05:00
Andrea Gottardo
d060b3fa02 cli: implement tailscale dns status (#13353)
Updates tailscale/tailscale#13326

This PR begins implementing a `tailscale dns` command group in the Tailscale CLI. It provides an initial implementation of `tailscale dns status` which dumps the state of the internal DNS forwarder.

Two new endpoints were added in LocalAPI to support the CLI functionality:

- `/netmap`: dumps a copy of the last received network map (because the CLI shouldn't have to listen to the ipn bus for a copy)
- `/dns-osconfig`: dumps the OS DNS configuration (this will be very handy for the UI clients as well, as they currently do not display this information)

My plan is to implement other subcommands mentioned in tailscale/tailscale#13326, such as `query`, in later PRs.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-09-04 19:43:55 +00:00
Nick Khyl
5bc9fafab8 ipn/ipnlocal: always send auth URL notifications when a user requests interactive login
This PR changes how LocalBackend handles interactive (initiated via StartLoginInteractive) and non-interactive (e.g., due to key expiration) logins,
and when it sends the authURL to the connected clients.

Specifically,
 - When a user initiates an interactive login by clicking Log In in the GUI, the LocalAPI calls StartLoginInteractive.
   If an authURL is available and hasn't expired, we immediately send it to all connected clients, suggesting them to open that URL in a browser.
   Otherwise, we send a login request to the control plane and set a flag indicating that an interactive login is in progress.
 - When LocalBackend receives an authURL from the control plane, we check if it differs from the previous one and whether an interactive login
   is in progress. If either condition is true, we notify all connected clients with the new authURL and reset the interactive login flag.

We reset the auth URL and flags upon a successful authentication, when a different user logs in and when switching Tailscale login profiles.

Finally, we remove the redundant dedup logic added to WatchNotifications in #12096 and revert the tests to their original state to ensure that
calling StartLoginInteractive always produces BrowseToURL notifications, either immediately or when the authURL is received from the control plane.

Fixes #13296

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-09-04 13:39:46 -05:00
Andrea Gottardo
0112da6070 net/dns: support GetBaseConfig on Darwin OSS tailscaled (#13351)
Updates tailscale/tailscale#177

It appears that the OSS distribution of `tailscaled` is currently unable to get the current system base DNS configuration, as GetBaseConfig() in manager_darwin.go is unimplemented. This PR adds a basic implementation that reads the current values in `/etc/resolv.conf`, to at least unblock DNS resolution via Quad100 if `--accept-dns` is enabled.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-09-04 10:31:58 -07:00
Jordan Whited
1fc4268aea cmd/stunstamp: increase probe jitter (#13362)
We've added more probe targets recently which has resulted in more
timeouts behind restrictive NATs in localized testing that don't
like how many flows we are creating at once. Not so much an issue
for datacenter or cloud-hosted deployments.

Updates tailscale/corp#22114

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-09-04 09:54:32 -07:00
Jordan Whited
1dd1798bfa cmd/stunstamp: use measureFn more consistently in naming/signatures (#13360)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-09-04 09:28:03 -07:00
Jordan Whited
6d6b1773ea cmd/stunstamp: implement ICMP{v6} probing (#13354)
This adds both userspace and kernel timestamping.

Updates tailscale/corp#22114

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-09-04 08:36:47 -07:00
Brad Fitzpatrick
c4d0237e5c tstest/natlab: add dual stack with blackholed IPv4
This reproduces the bug report from
https://github.com/tailscale/tailscale/issues/13346

It does not yet fix it.

Updates #13346

Change-Id: Ia5af7b0481a64a37efe259c798facdda6d9da618
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-03 17:16:26 -07:00
Nick Khyl
aeb15dea30 util/syspolicy/source: add package for reading policy settings from external stores
We add package defining interfaces for policy stores, enabling creation of policy sources
and reading settings from them. It includes a Windows-specific PlatformPolicyStore for GP and MDM
policies stored in the Registry, and an in-memory TestStore for testing purposes.

We also include an internal package that tracks and reports policy usage metrics when a policy setting
is read from a store. Initially, it will be used only on Windows and Android, as macOS, iOS, and tvOS
report their own metrics. However, we plan to use it across all platforms eventually.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-09-03 14:51:14 -05:00
Brad Fitzpatrick
e865a0e2b0 cmd/tailscale/cli: add 'debug go-buildinfo' subcommand
To dump runtime/debug.BuildInfo.

Updates #1866

Change-Id: I8810390858a03b7649f9b22ef3ab910d423388da
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-03 11:37:04 -07:00
Seaver Thorn
345876da33 client/tailscale: adding missing proto field in ACL parsing structures (#13051)
Signed-off-by: Seaver Thorn <swthorn@ncsu.edu>
2024-09-03 18:04:39 +00:00
Irbe Krumina
8e1c00f841 cmd/k8s-operator,k8s-operator/sessionrecording: ensure recording header contains terminal size for terminal sessions (#12965)
* cmd/k8s-operator,k8s-operator/sessonrecording: ensure CastHeader contains terminal size

For tsrecorder to be able to play session recordings, the recording's
CastHeader must have '.Width' and '.Height' fields set to non-zero.
Kubectl (or whoever is the client that initiates the 'kubectl exec'
session recording) sends the terminal dimensions in a resize message that
the API server proxy can intercept, however that races with the first server
message that we need to record.
This PR ensures we wait for the terminal dimensions to be processed from
the first resize message before any other data is sent, so that for all
sessions with terminal attached, the header of the session recording
contains the terminal dimensions and the recording can be played by tsrecorder.

Updates tailscale/tailscale#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-09-03 18:42:02 +01:00
Andrew Dunham
1c972bc7cb wgengine/magicsock: actually use AF_PACKET socket for raw disco
Previously, despite what the commit said, we were using a raw IP socket
that was *not* an AF_PACKET socket, and thus was subject to the host
firewall rules. Switch to using a real AF_PACKET socket to actually get
the functionality we want.

Updates #13140

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: If657daeeda9ab8d967e75a4f049c66e2bca54b78
2024-09-03 12:50:09 -04:00
Brad Fitzpatrick
eb2fa16fcc tailcfg: bump capver for earlier cryptokey panic fix [capver 106]
I should've bumped capver in 65fe0ba7b5 but forgot.

This lets us turn off the cryptokey routing change from control for
the affected panicky range of commits, based on capver.

Updates #13332
Updates tailscale/corp#20732

Change-Id: I32c17cfcb45b2369b2b560032330551d47a0ce0b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-03 09:36:47 -07:00
Brad Fitzpatrick
20cf48b8dd gokrazy{,/natlabapp.arm64}: start adding arm64 appliance support
Both for Raspberry Pis, and for running natlab tests faster on Apple
Silicon Macs without emulating x86.

Not fully wired up yet.

Updates #1866
Updates #13038

Change-Id: I1552bf107069308f325f640773cc881ed735b5ab
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-03 09:24:15 -07:00
Brad Fitzpatrick
65fe0ba7b5 wgengine/magicsock: fix panic regression from cryptokey routing change
Fixes #13332
Updates tailscale/corp#20732

Change-Id: I30f12746844bf77f5a664bf8e8d8ebf2511a2b27
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-31 06:19:28 -07:00
Nick Khyl
2f2aeaeaeb ipn/ipnlocal: fix a nil pointer dereference when serving /localapi/v0/tka/status
Fixes #13330

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-08-30 23:13:44 -05:00
Brad Fitzpatrick
3d9e3a17fa tstest/natlab/vnet: move some boilerplate to mkPacket helper
No need to make callers specify the redundant IP version or
TTL/HopLimit or EthernetType in the common case. The mkPacket helper
can set those when unset.

And use the mkIPLayer in another place, simplifying some code.

And rename mkPacketErr to just mkPacket, then move mkPacket to
test-only code, as mustPacket.

Updates #13038

Change-Id: Ic216e44dda760c69ab9bfc509370040874a47d30
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-30 20:23:30 -07:00
Brad Fitzpatrick
7e88d6712e tstest/natlab/vnet: add syslog tests
Updates #13038

Change-Id: I4ac96cb0a9e46a2fb1e09ddedd3614eb006c2c8c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-30 14:22:01 -07:00
Brad Fitzpatrick
b1a5b40318 tstest/natlab/vnet: add DHCP tests, ignore DHCPv4 on v6-only networks
And clean up some of the test helpers in the process.

Updates #13038

Change-Id: I3e2b5f7028a32d97af7f91941e59399a8e222b25
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-30 08:49:01 -07:00
Brad Fitzpatrick
ffa1c93f59 tstest/natlab/vnet: use mkPacketErr in more places
I'd added this helper for tests, but then moved it to non-test code
and forgot some places to use it. This uses it in more places to
remove some boilerplate.

Updates #13038

Change-Id: Ic4dc339be1c47a55b71d806bab421097ee3d75ed
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-30 08:49:01 -07:00
Anton Tolchanov
109d0891e1 posture: stop logging serial numbers
Logging serial numbers every time they are read might have been useful
early on, but seems unnecessary now.

Updates #5902

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-08-30 15:45:53 +01:00
Nick Khyl
959285e0c5 ipn/ipnlocal: fix race condition that results in a panic sending on a closed channel
Fixes #13288

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-08-29 17:27:21 -05:00
Percy Wegmann
35423fcf69 drive/driveimpl: use su instead of sudo
This allows Taildrive to work on systems like Busybox that don't have sudo.

Fixes #12282

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-08-29 16:23:03 -05:00
Jordan Whited
45c97751fb net/tstun: clarify GROFilterFunc *gro.GRO usage (#13318)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-29 13:04:46 -07:00
Percy Wegmann
ecc451501c ssh/tailssh: add ability to force V2 behavior using new feature flag
Introduces ssh-behavior-v2 node attribute to override ssh-behavior-v1.

Updates #11854

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-08-29 15:02:58 -05:00
Andrea Gottardo
a584d04f8a dns: increase TimeToVisible before DNS unavailable warning (#13317)
Updates tailscale/tailscale#13314

Some users are reporting 'DNS unavailable' spurious (?) warnings, especially on Android:

https://old.reddit.com/r/Tailscale/comments/1f2ow3w/health_warning_dns_unavailable_on_tailscale/
https://old.reddit.com/r/Tailscale/comments/1f3l2il/health_warnings_dns_unavailable_what_does_it_mean/

I suspect this is caused by having a too low TimeToVisible setting on the Warnable, which triggers the unhealthy state during slow network transitions.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-08-29 11:43:38 -07:00
Jordan Whited
0926954cf5 net/tstun,wgengine/netstack: implement TCP GRO for local services (#13315)
Throughput improves substantially when measured via netstack loopback
(TS_DEBUG_NETSTACK_LOOPBACK_PORT).

Before (d21ebc2):
jwhited@i5-12400-2:~$ iperf3 -V -c 100.100.100.100
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  5.77 GBytes  4.95 Gbits/sec    0 sender
[  5]   0.00-10.01  sec  5.77 GBytes  4.95 Gbits/sec      receiver

After:
jwhited@i5-12400-2:~$ iperf3 -V -c 100.100.100.100
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  12.7 GBytes  10.9 Gbits/sec    0 sender
[  5]   0.00-10.00  sec  12.7 GBytes  10.9 Gbits/sec      receiver

Updates tailscale/corp#22754

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-29 11:37:48 -07:00
Jordan Whited
71acf87830 tstest/integration: add UDP netstack loopback integration test (#13312)
Updates tailscale/corp#22713

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-29 11:17:27 -07:00
Kristoffer Dalby
e93c160a39 nix: update nix and use go 1.23
Updates #12912

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-08-29 17:25:13 +02:00
Nick Khyl
b48c8db69c ipn/ipnlocal: set WantRunning upon an interactive login, but not during a seamless renewal or a profile switch
The LocalBackend's state machine starts in NoState and soon transitions to NeedsLogin if there's no auto-start profile,
with the profileManager starting with a new empty profile. Notably, entering the NeedsLogin state blocks engine updates.
We expect the user to transition out of this state by logging in interactively, and we set WantRunning to true when
controlclient enters the StateAuthenticated state.

While our intention is correct, and completing an interactive login should set WantRunning to true, our assumption
that logging into the current Tailscale profile is the only way to transition out of the NeedsLogin state is not accurate.
Another common transition path includes an explicit profile switch (via LocalBackend.SwitchProfile) or an implicit switch
when a Windows user connects to the backend. This results in a bug where WantRunning is set to true even when it was
previously set to false, and the user expressed no intention of changing it.

A similar issue occurs when switching from (sic) a Tailnet that has seamlessRenewalEnabled, regardless of the current state
of the LocalBackend's state machine, and also results in unexpectedly set WantRunning. While this behavior is generally
undesired, it is also incorrect that it depends on the control knobs of the Tailnet we're switching from rather than
the Tailnet we're switching to. However, this issue needs to be addressed separately.

This PR updates LocalBackend.SetControlClientStatus to only set WantRunning to true in response to an interactive login
as indicated by a non-empty authURL.

Fixes #6668
Fixes #11280
Updates #12756

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-08-29 09:27:17 -05:00
Brad Fitzpatrick
82c2c5c597 tstest/natlab/vnet: add more tests
This adds tests for DNS requests, and ignoring IPv6 packets on v4-only
networks.

No behavior changes. But some things are pulled out into functions.

And the mkPacket helpers previously just for tests are moved into
non-test code to be used elsewhere to reduce duplication, doing the
checksum stuff automatically.

Updates #13038

Change-Id: I4dd0b73c75b2b9567b4be3f05a2792999d83f6a3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-28 21:39:29 -07:00
Jordan Whited
d21ebc28af wgengine/netstack: implement netstack loopback (#13301)
When the TS_DEBUG_NETSTACK_LOOPBACK_PORT environment variable is set,
netstack will loop back (dnat to addressFamilyLoopback:loopbackPort)
TCP & UDP flows originally destined to localServicesIP:loopbackPort.
localServicesIP is quad-100 or the IPv6 equivalent.

Updates tailscale/corp#22713

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-28 18:50:13 -07:00
Nick Khyl
80b2b45d60 ipn/ipnlocal: refactor and cleanup profileManager
In preparation for multi-user and unattended mode improvements, we are
refactoring and cleaning up `ipn/ipnlocal.profileManager`. The concept of the
"current user", which is only relevant on Windows, is being deprecated and will
soon be removed to allow more than one Windows user to connect and utilize
`LocalBackend` according to that user's access rights to the device and specific
Tailscale profiles.

We plan to pass the user's identity down to the `profileManager`, where it can
be used to determine the user's access rights to a given `LoginProfile`. While
the new permission model in `ipnauth` requires more work and is currently
blocked pending PR reviews, we are updating the `profileManager` to reduce its
reliance on the concept of a single OS user being connected to the backend at
the same time.

We extract the switching to the default Tailscale profile, which may also
trigger legacy profile migration, from `profileManager.SetCurrentUserID`. This
introduces `profileManager.DefaultUserProfileID`, which returns the default
profile ID for the current user, and `profileManager.SwitchToDefaultProfile`,
which is essentially a shorthand for `pm.SwitchProfile(pm.DefaultUserProfileID())`.
Both methods will eventually be updated to accept the user's identity and
utilize that user's default profile.

We make access checks more explicit by introducing the `profileManager.checkProfileAccess`
method. The current implementation continues to use `profileManager.currentUserID`
and `LoginProfile.LocalUserID` to determine whether access to a given profile
should be granted. This will be updated to utilize the `ipnauth` package and the
new permissions model once it's ready. We also expand access checks to be used
more widely in the `profileManager`, not just when switching or listing
profiles. This includes access checks in methods like `SetPrefs` and, most notably,
`DeleteProfile` and `DeleteAllProfiles`, preventing unprivileged Windows users
from deleting Tailscale profiles owned by other users on the same device,
including profiles owned by local admins.

We extract `profileManager.ProfilePrefs` and `profileManager.SetProfilePrefs`
methods that can be used to get and set preferences of a given `LoginProfile` if
`profileManager.checkProfileAccess` permits access to it.

We also update `profileManager.setUnattendedModeAsConfigured` to always enable
unattended mode on Windows if `Prefs.ForceDaemon` is true in the current
`LoginProfile`, even if `profileManager.currentUserID` is `""`. This facilitates
enabling unattended mode via `tailscale up --unattended` even if
`tailscale-ipn.exe` is not running, such as when a Group Policy or MDM-deployed
script runs at boot time, or when Tailscale is used on a Server Code or otherwise
headless Windows environments. See #12239, #2137, #3186 and
https://github.com/tailscale/tailscale/pull/6255#issuecomment-2016623838 for
details.

Fixes #12239
Updates tailscale/corp#18342
Updates #3186
Updates #2137

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-08-28 14:42:35 -05:00
Brad Fitzpatrick
73b3c8fc8c tstest/natlab/vnet: add IPv6 all-nodes support
This adds support for sending packets to 33:33:00:00:01 at IPv6
multicast address ff02::1 to send to all nodes.

Nothing in Tailscale depends on this (yet?), but it makes debugging in
VMs behind natlab easier (e.g. you can ping all nodes), and other
things might depend on this in the future.

Mostly I'm trying to flesh out the IPv6 support in natlab now that we
can write vnet tests.

Updates #13038

Change-Id: If590031fcf075690ca35c7b230a38c3e72e621eb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-28 12:04:19 -07:00
Nick Khyl
961ee321e8 ipn/{ipnauth,ipnlocal,ipnserver,localapi}: start baby step toward moving access checks from the localapi.Handler to the LocalBackend
Currently, we use PermitRead/PermitWrite/PermitCert permission flags to determine which operations are allowed for a LocalAPI client.
These checks are performed when localapi.Handler handles a request. Additionally, certain operations (e.g., changing the serve config)
requires the connected user to be a local admin. This approach is inherently racey and is subject to TOCTOU issues.
We consider it to be more critical on Windows environments, which are inherently multi-user, and therefore we prevent more than one
OS user from connecting and utilizing the LocalBackend at the same time. However, the same type of issues is also applicable to other
platforms when switching between profiles that have different OperatorUser values in ipn.Prefs.

We'd like to allow more than one Windows user to connect, but limit what they can see and do based on their access rights on the device
(e.g., an local admin or not) and to the currently active LoginProfile (e.g., owner/operator or not), while preventing TOCTOU issues on Windows
and other platforms. Therefore, we'd like to pass an actor from the LocalAPI to the LocalBackend to represent the user performing the operation.
The LocalBackend, or the profileManager down the line, will then check the actor's access rights to perform a given operation on the device
and against the current (and/or the target) profile.

This PR does not change the current permission model in any way, but it introduces the concept of an actor and includes some preparatory
work to pass it around. Temporarily, the ipnauth.Actor interface has methods like IsLocalSystem and IsLocalAdmin, which are only relevant
to the current permission model. It also lacks methods that will actually be used in the new model. We'll be adding these gradually in the next
PRs and removing the deprecated methods and the Permit* flags at the end of the transition.

Updates tailscale/corp#18342

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-08-28 13:49:58 -05:00
Brad Fitzpatrick
8b23ba7d05 tstest/natlab/vnet: add qemu + Virtualization.framework protocol tests
To test how virtual machines connect to the natlab vnet code.

Updates #13038

Change-Id: Ia4fd4b0c1803580ee7d94cc9878d777ad4f24f82
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-27 22:30:20 -07:00
Brad Fitzpatrick
ff1d0aa027 tstest/natlab/vnet: start adding tests
And refactor some of vnet.go for testability.

The only behavioral change (with a new test) is that ethernet
broadcasts no longer get sent back to the sender.

Updates #13038

Change-Id: Ic2e7e7d6d8805b7b7f2b5c52c2c5ba97101cef14
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-27 18:32:48 -07:00
Jordan Whited
31cdbd68b1 net/tstun: fix gvisor inbound GSO packet injection (#13283)
buffs[0] was not sized to hold pkt with GSO, resulting in a panic.

Updates tailscale/corp#22511

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-27 14:59:43 -07:00
Kristoffer Dalby
a2c42d3cd4 usermetric: add initial user-facing metrics
This commit adds a new usermetric package and wires
up metrics across the tailscale client.

Updates tailscale/corp#22075

Co-authored-by: Anton Tolchanov <anton@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-08-27 11:21:35 +02:00
Kristoffer Dalby
06c31f4e91 tsweb/varz: remove pprof
Updates tailscale/corp#22075

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-08-27 11:21:35 +02:00
Jordan Whited
bfcb3562e6 wgengine/netstack: re-enable gVisor GSO on Linux (#13269)
This was previously disabled in 8e42510 due to missing GSO-awareness in
tstun, which was resolved in d097096.

Updates tailscale/corp#22511

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-26 20:35:32 -07:00
Jordan Whited
d097096ddc net/tstun,wgengine/netstack: make inbound synthetic packet injection GSO-aware (#13266)
Updates tailscale/corp#22511

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-26 19:26:39 -07:00
Jordan Whited
6d4973e1e0 wgengine/netstack: use types/logger.Logf instead of stdlib log.Printf (#13267)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-26 16:16:06 -07:00
Brad Fitzpatrick
f99f970dc1 tstest/natlab/vnet: rename some things for clarity
The bad naming (which had only been half updated with the IPv6
changes) tripped me up in the earlier change.

Updates #13038

Change-Id: I65ce07c167e8219d35b87e1f4bf61aab4cac31ff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-26 15:36:30 -07:00
Brad Fitzpatrick
0157000cab tstest/natlab: fix IPv6 tests, remove TODOs
The reason they weren't working was because the cmd/tta agent in the
guest was dialing out to the test and the vnet couldn't map its global
unicast IPv6 address to a node as it was just using a
map[netip.Addr]*node and blindly trusting the *node was
populated. Instead, it was nil, so the agent connection fetching
didn't work for its RoundTripper and the test could never drive the
node. That map worked for IPv4 but for IPv6 we need to use the method
that takes into account the node's IPv6 SLAAC address. Most call sites
had been converted but I'd missed that one.

Also clean up some debug, and prohibit nodes' link-local unicast
addresses from dialing 2000::/3 directly for now. We can allow that to
be configured opt-in later (some sort of IPv6 NAT mode. Whatever it's
called.) That mode was working on accident, but was confusing: Linux
would do source address selection from link local for the first few
seconds and then after SLAAC and DAD, switch to using the global
unicast source address. Be consistent for now and force it to use the
global unicast.

Updates #13038

Change-Id: I85e973aaa38b43c14611943ff45c7c825ee9200a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-26 15:36:30 -07:00
Brad Fitzpatrick
9f7683e2a1 logpolicy: extend the gokrazy/natlab wait-for-network delay for IPv6
Really we need to fix logpolicy + bootstrapDNS to not be so aggressive,
but this is a quick workaround meanwhile.

Without this, tailscaled starts immediately while IPv6 DAD is
happening for a couple seconds and logpolicy freaks out without the
network available and starts spamming stderr about bootstrap DNS
options. But we see that regularly anyway from people whose wifi is
down. So we need to fix the general case. This is not that fix.

Updates #13038

Change-Id: Iba7e536d08e59d34abded1d279f88fdc9c46d94d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-26 15:36:30 -07:00
Brad Fitzpatrick
2636a83d0e cmd/tta: pull out test driver dialing into a type, fix bugs
There were a few places it could get wedged (notably the dial without
a timeout).

And add a knob for verbose debug logs.

And keep two idle connections always.

Updates #13038

Change-Id: I952ad182d7111481d97a83c12aa2ff4bfdc55fe8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-26 15:36:30 -07:00
Brad Fitzpatrick
6dd1af0d1e tstest/natlab: refactor HandleEthernetPacketForRouter a bit
Move all the UDP handling to its own func to remove a bunch of "if
isUDP" checks in a bunch of blocks.

Updates #13038

Change-Id: If71d71b49e57651d15bd307a2233c43751cc8639
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-26 15:36:30 -07:00
Brad Fitzpatrick
3a8cfbc381 tstest/natlab: be more paranoid about IP versions from gvisor
I didn't actually see this, but added this while debugging something
and figured it'd be good to keep.

Updates #13038

Change-Id: I67934c8a329e0233f79c3b08516fd6bad6bfe22a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-26 15:36:30 -07:00
Brad Fitzpatrick
e0bdd5d058 tstest/natlab: simplify a defer
Updates #13038

Change-Id: I4d38701491523c64c81767b0838010609e683a9f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-26 15:36:30 -07:00
Will Norris
cccacff564 types/opt: add BoolFlag for setting Bool value as a flag
Updates tailscale/corp#22578

Signed-off-by: Will Norris <will@tailscale.com>
2024-08-26 11:32:35 -07:00
James Tucker
8af50fa97c ipn/ipnlocal: update routes on link change with ExitNodeAllowLANAccess
On a major link change the LAN routes may change, so on linkChange where
ChangeDelta.Major, we need to call authReconfig to ensure that new
routes are observed and applied.

Updates tailscale/corp#22574

Signed-off-by: James Tucker <james@tailscale.com>
2024-08-26 11:27:38 -07:00
Brad Fitzpatrick
b78df4d48a tstest/natlab/vnet: add start of IPv6 support
Updates #13038

Change-Id: Ic3d095f167daf6c7129463e881b18f2e0d5693f5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-24 18:02:38 -07:00
Maisem Ali
31b5239a2f tstest/natlab/vnet: flush and sync pcap file after every packet
So that we can view the pcap as we debug interactively.

Updates #13038

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-24 11:44:50 -07:00
Jordan Whited
978306565d tstest/integration: change log.Fatal() to t.Fatal() (#13253)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-23 16:58:31 -07:00
Jordan Whited
367bfa607c tstest/integration: exercise TCP DNS queries against quad-100 (#13231)
Updates tailscale/corp#22511

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-23 16:40:18 -07:00
Jordan Whited
641693d61c ipn/ipnlocal: install IPv6 service addr route (#13252)
This is the equivalent of quad-100, but for IPv6. This is technically
already contained in the Tailscale IPv6 ULA prefix, but that is only
installed when remote peers are visible via control with contained
addrs. The service addr should always be reachable.

Updates #1152

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-23 16:22:56 -07:00
Brad Fitzpatrick
475ab1fb67 cmd/vnet: omit log spam when backend status hasn't changed
Updates #13038

Change-Id: I9cc67cf18ba44ff66ba03cda486d5e111e395ce7
2024-08-23 14:24:01 -07:00
Brad Fitzpatrick
e5fd36ad78 tstest/natlab: respect NATTable interface's invalid-means-drop everywhere
And sprinkle some more docs around.

Updates #13038

Change-Id: Ia2dcf567b68170481cc2094d64b085c6b94a778a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-23 14:21:59 -07:00
Nick Khyl
03acab2639 cmd/cloner, cmd/viewer, util/codegen: add support for aliases of cloneable types
We have several checked type assertions to *types.Named in both cmd/cloner and cmd/viewer.
As Go 1.23 updates the go/types package to produce Alias type nodes for type aliases,
these type assertions no longer work as expected unless the new behavior is disabled
with gotypesalias=0.

In this PR, we add codegen.NamedTypeOf(t types.Type), which functions like t.(*types.Named)
but also unrolls type aliases. We then use it in place of type assertions in the cmd/cloner and
cmd/viewer packages where appropriate.

We also update type switches to include *types.Alias alongside *types.Named in relevant cases,
remove *types.Struct cases when switching on types.Type.Underlying and update the tests
with more cases where type aliases can be used.

Updates #13224
Updates #12912

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-08-23 15:43:40 -05:00
Nick Khyl
a9dc6e07ad util/codegen, cmd/cloner, cmd/viewer: update codegen.LookupMethod to support alias type nodes
Go 1.23 updates the go/types package to produce Alias type nodes for type aliases, unless disabled with gotypesalias=0.
This new default behavior breaks codegen.LookupMethod, which uses checked type assertions to types.Named and
types.Interface, as only named types and interfaces have methods.

In this PR, we update codegen.LookupMethod to perform method lookup on the right-hand side of the alias declaration
and clearly switch on the supported type nodes types. We also improve support for various edge cases, such as when an alias
is used as a type parameter constraint, and add tests for the LookupMethod function.

Additionally, we update cmd/viewer/tests to include types with aliases used in type fields and generic type constraints.

Updates #13224
Updates #12912

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-08-23 15:26:05 -05:00
Brad Fitzpatrick
aa42ae9058 tstest/natlab: make a new virtualIP type in prep for IPv6 support
All the magic service names with virtual IPs will need IPv6 variants.

Pull this out in prep.

Updates #13038

Change-Id: I53b5eebd0679f9fa43dc0674805049258c83a0de
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-23 13:16:33 -07:00
Brad Fitzpatrick
5a99940dfa tstest/natlab/vnet: explicitly ignore PCP and SSDP UDP queries
So we don't log about them when verbose logging is enabled.

Updates #13038

Change-Id: I925bc3a23e6c93d60dd4fb4bf6a4fdc5a326de95
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-23 12:10:19 -07:00
Brad Fitzpatrick
3b70968c25 cmd/vnet: add --blend and --pcap flags
Updates #13038

Change-Id: Id16ea9eb94447a3d9651215f04b2525daf10b3eb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-23 12:10:19 -07:00
Brad Fitzpatrick
3904e4d175 cmd/tta, tstest/natlab/vnet: remove unneeded port 124 log hack, add log buffer
The natlab Test Agent (tta) still had its old log streaming hack in
place where it dialed out to anything on TCP port 124 and those logs
were streamed to the host running the tests. But we'd since added gokrazy
syslog streaming support, which made that redundant.

So remove all the port 124 stuff. And then make sure we log to stderr
so gokrazy logs it to syslog.

Also, keep the first 1MB of logs in memory in tta too, exported via
localhost:8034/logs for interactive debugging. That was very useful
during debugging when I added IPv6 support. (which is coming in future
PRs)

Updates #13038

Change-Id: Ieed904a704410b9031d5fd5f014a73412348fa7f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-23 12:10:19 -07:00
Flakes Updater
d862898fd3 go.mod.sri: update SRI hash for go.mod changes
Signed-off-by: Flakes Updater <noreply+flakes-updater@tailscale.com>
2024-08-23 10:30:07 -07:00
Brad Fitzpatrick
b091264c0a cmd/systray: set ipn.NotifyNoPrivateKeys, permit non-operator use
Otherwise you get "Access denied: watch IPN bus access denied, must
set ipn.NotifyNoPrivateKeys when not running as admin/root or
operator".

This lets a non-operator at least start the app and see the status, even
if they can't change everything. (the web UI is unaffected by operator)

A future change can add a LocalAPI call to check permissions and guide
people through adding a user as an operator (perhaps the web client
can do that?)

Updates #1708

Change-Id: I699e035a251b4ebe14385102d5e7a2993424c4b7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-23 10:15:49 -07:00
Will Norris
3c66ee3f57 cmd/systray: add a basic linux systray app
This adds a systray app for linux, similar to the apps for macOS and
windows. There are already a number of community-developed systray apps,
but most of them are either long abandoned, are built for a specific
desktop environment, or simply wrap the tailscale CLI.

This uses fyne.io/systray (a fork of github.com/getlantern/systray)
which uses newer D-Bus specifications to render the tray icon and menu.
This results in a pretty broad support for modern desktop environments.

This initial commit lacks a number of features like profile switching,
device listing, and exit node selection. This is really focused on the
application structure, the interaction with LocalAPI, and some system
integration pieces like the app icon, notifications, and the clipboard.

Updates #1708

Signed-off-by: Will Norris <will@tailscale.com>
2024-08-23 00:35:25 -07:00
Flakes Updater
6280c44be1 go.mod.sri: update SRI hash for go.mod changes
Signed-off-by: Flakes Updater <noreply+flakes-updater@tailscale.com>
2024-08-22 15:42:08 -07:00
Jonathan Nobels
1191eb0e3d tstest/natlab: add unix address to writer for dgram mode
updates tailcale/corp#22371

For dgram mode, we need to store the write addresses of
the client socket(s) alongside the writer functions and
the write operation needs to use WriteToUnix.

Unix also has multiple clients writing to the same socket,
so the serve method is modified to handle packets from
multiple mac addresses.

Cleans up a bit of cruft from the initial tailmac tooling
commit.

Now all the macOS packets are belong to us.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2024-08-22 15:37:37 -07:00
Percy Wegmann
743d296073 update to github.com/tailscale/netlink library that doesn't require vishvananda/netlink
Fixes #12298

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-08-22 17:35:37 -05:00
Percy Wegmann
d00d6d6dc2 go.mod: update to github.com/tailscale/netlink library that doesn't require vishvananda/netlink
After the upstream PR is merged, we can point directly at github.com/vishvananda/netlink
and retire github.com/tailscale/netlink.

See https://github.com/vishvananda/netlink/pull/1006

Updates #12298

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-08-22 17:35:37 -05:00
Brad Fitzpatrick
e54c81d1d0 types/views: add Slice.All iterator
And convert a few callers as an example, but nowhere near all.

Updates #12912

Change-Id: I5eaa12a29a6cd03b58d6f1072bd27bc0467852f2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-22 14:55:33 -07:00
Flakes Updater
aedfb82876 go.mod.sri: update SRI hash for go.mod changes
Signed-off-by: Flakes Updater <noreply+flakes-updater@tailscale.com>
2024-08-22 12:48:46 -07:00
Ilarion Kovalchuk
0cb7eb9b75 net/dns: updated gonotify dependency to v2 that supports closable context
Signed-off-by: Ilarion Kovalchuk <illarion.kovalchuk@gmail.com>
2024-08-22 12:36:26 -07:00
Brad Fitzpatrick
696711cc17 all: switch to and require Go 1.23
Updates #12912

Change-Id: Ib4ae26eb5fb68ad2216cab4913811b94f7eed5b6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-22 12:31:08 -07:00
Brad Fitzpatrick
0ff474ff37 all: fix new lint warnings from bumping staticcheck
In prep for updating to new staticcheck required for Go 1.23.

Updates #12912

Change-Id: If77892a023b79c6fa798f936fc80428fd4ce0673
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-22 12:31:08 -07:00
Percy Wegmann
4637ac732e ipn/ipnlocal: remember last notified taildrive shares and only notify if they've changed
Fixes #13195

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-08-22 08:51:07 -05:00
Brad Fitzpatrick
690d3bfafe cmd/tailscale/cli: add debug command to do DNS lookups portably
To avoid dig vs nslookup vs $X availability issues between
OSes/distros. And to be in Go, to match the resolver we use.

Updates #13038

Change-Id: Ib7e5c351ed36b5470a42cbc230b8f27eed9a1bf8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-21 20:35:33 -07:00
Jordan Whited
8e42510a71 wgengine/netstack: disable gVisor GSO on Linux (#13215)
net/tstun.Wrapper.InjectInboundPacketBuffer is not GSO-aware, which can
break quad-100 TCP streams as a result. Linux is the only platform where
gVisor GSO was previously enabled.

Updates tailscale/corp#22511
Updates #13211

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-21 13:59:29 -07:00
Percy Wegmann
4b525fdda0 ssh/tailssh: only chdir incubator process to user's homedir when necessary and possible
Instead of changing the working directory before launching the incubator process,
this now just changes the working directory after dropping privileges, at which
point we're more likely to be able to enter the user's home directory since we're
running as the user.

For paths that use the 'login' or 'su -l' commands, those already take care of changing
the working directory to the user's home directory.

Fixes #13120

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-08-21 13:20:12 -05:00
Nick Khyl
af3d3c433b types/prefs: add a package containing generic preference types
This adds a new package containing generic types to be used for defining preference hierarchies.
These include prefs.Item, prefs.List, prefs.StructList, and prefs.StructMap. Each of these types
represents a configurable preference, holding the preference's state, value, and metadata.
The metadata includes the default value (if it differs from the zero value of the Go type)
and flags indicating whether a preference is managed via syspolicy or is hidden/read-only for
another reason. This information can be marshaled and sent to the GUI, CLI and web clients
as a source of truth regarding preference configuration, management, and visibility/mutability states.

We plan to use these types to define device preferences, such as the updater preferences,
the permission mode to be used on Windows with #tailscale/corp#18342, and certain global options
that are currently exposed as tailscaled flags. We also aim to eventually use these types for
profile-local preferences in ipn.Prefs and and as a replacement for ipn.MaskedPrefs.

The generic preference types are compatible with the tailscale.com/cmd/viewer and
tailscale.com/cmd/cloner utilities.

Updates #12736

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-08-21 12:44:38 -05:00
Anton Tolchanov
151b77f9d6 cmd/tl-longchain: tool to re-sign nodes with long rotation signatures
In Tailnet Lock, there is an implicit limit on the number of rotation
signatures that can be chained before the signature becomes too long.

This program helps tailnet admins to identify nodes that have signatures
with long chains and prints commands to re-sign those node keys with a
fresh direct signature. It's a temporary mitigation measure, and we will
remove this tool as we design and implement a long-term approach for
rotation signatures.

Example output:

```
2024/08/20 18:25:03 Self: does not need re-signing
2024/08/20 18:25:03 Visible peers with valid signatures:
2024/08/20 18:25:03 Peer xxx2.yy.ts.net. (100.77.192.34) nodeid=nyDmhiZiGA11KTM59, current signature kind=direct: does not need re-signing
2024/08/20 18:25:03 Peer xxx3.yy.ts.net. (100.84.248.22) nodeid=ndQ64mDnaB11KTM59, current signature kind=direct: does not need re-signing
2024/08/20 18:25:03 Peer xxx4.yy.ts.net. (100.85.253.53) nodeid=nmZfVygzkB21KTM59, current signature kind=rotation: chain length 4, printing command to re-sign
tailscale lock sign nodekey:530bddbfbe69e91fe15758a1d6ead5337aa6307e55ac92dafad3794f8b3fc661 tlpub:4bf07597336703395f2149dce88e7c50dd8694ab5bbde3d7c2a1c7b3e231a3c2
```

To support this, the NetworkLockStatus localapi response now includes
information about signatures of all peers rather than just the invalid
ones. This is not displayed by default in `tailscale lock status`, but
will be surfaced in `tailscale lock status --json`.

Updates #13185

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-08-21 18:22:22 +01:00
Percy Wegmann
7d83056a1b ssh/tailssh: fix SSH on busybox systems
This involved the following:

1. Pass the su command path as first of args in call to unix.Exec to make sure that busybox sees the correct program name.
   Busybox is a single executable userspace that implements various core userspace commands in a single binary. You'll
   see it used via symlinking, so that for example /bin/su symlinks to /bin/busybox. Busybox knows that you're trying
   to execute /bin/su because argv[0] is '/bin/su'. When we called unix.Exec, we weren't including the program name for
   argv[0], which caused busybox to fail with 'applet not found', meaning that it didn't know which command it was
   supposed to run.
2. Tell su to whitelist the SSH_AUTH_SOCK environment variable in order to support ssh agent forwarding.
3. Run integration tests on alpine, which uses busybox.
4. Increment CurrentCapabilityVersion to allow turning on SSH V2 behavior from control.

Fixes #12849

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-08-21 11:44:41 -05:00
Jordan Whited
7675c3ebf2 wgengine/netstack/gro: exclude importation of gVisor GRO pkg on iOS (#13202)
In df6014f1d7 we removed build tag
gating preventing importation, which tripped a NetworkExtension limit
test in corp. This was a reversal of
25f0a3fc8f which actually made the
situation worse, hence the simplification.

This commit goes back to the strategy in
25f0a3fc8f, and gets us back under the
limit in my local testing. Admittedly, we don't fully understand
the effects of importing or excluding importation of this package,
and have seen mixed results, but this commit allows us to move forward
again.

Updates tailscale/corp#22125

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-20 16:40:10 -07:00
Jordan Whited
df6014f1d7 net/tstun,wgengine{/netstack/gro}: refactor and re-enable gVisor GRO for Linux (#13172)
In 2f27319baf we disabled GRO due to a
data race around concurrent calls to tstun.Wrapper.Write(). This commit
refactors GRO to be thread-safe, and re-enables it on Linux.

This refactor now carries a GRO type across tstun and netstack APIs
with a lifetime that is scoped to a single tstun.Wrapper.Write() call.

In 25f0a3fc8f we used build tags to
prevent importation of gVisor's GRO package on iOS as at the time we
believed it was contributing to additional memory usage on that
platform. It wasn't, so this commit simplifies and removes those
build tags.

Updates tailscale/corp#22353
Updates tailscale/corp#22125
Updates #6816

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-20 15:22:19 -07:00
ChandonPierre
93dc2ded6e cmd/k8s-operator: support default proxy class in k8s-operator (#12711)
Signed-off-by: ChandonPierre <cpierre@coreweave.com>

Closes #12421
2024-08-20 15:50:40 +01:00
Aaron Klotz
8f6a2353d8 util/winutil: add GetRegUserString/SetRegUserString accessors for storage and retrieval of string values in HKEY_CURRENT_USER
Fixes #13187

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2024-08-20 08:07:57 -06:00
pierig-n3xtio
2105773874 cmd/k8s-operator/deploy: replace wildcards in Kubernetes Operator RBAC role definitions with verbs
cmd/k8s-operator/deploy: replace wildcards in Kubernetes Operator RBAC role definitions with verbs

fixes: #13168

Signed-off-by: Pierig Le Saux <pierig@n3xt.io>
2024-08-20 14:44:50 +01:00
Kristoffer Dalby
01aa01f310 ipn/ipnlocal: network-lock, error if no pubkey instead of panic
Updates tailscale/corp#20931

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-08-20 09:12:52 +02:00
Andrea Gottardo
9d2b1820f1 ipnlocal: support setting authkey at login using syspolicy (#13061)
Updates tailscale/corp#22120

Adds the ability to start the backend by reading an authkey stored in the syspolicy database (MDM). This is useful for devices that are provisioned in an unattended fashion.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-08-19 23:49:33 -07:00
tomholford
16bb541adb wgengine/magicsock: replace deprecated poly1305 (#13184)
Signed-off-by: tomholford <tomholford@users.noreply.github.com>
2024-08-19 14:20:58 -07:00
Aaron Klotz
f95785f22b util/winutil: add constants from Win32 SDK for dll blocking mitigation policies
Fixes #13182

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2024-08-19 13:33:48 -06:00
Jonathan Nobels
8fad8c4b9b tstest/tailmac: add customized macOS virtualization tooling (#13146)
updates tailcale/corp#22371

Adds custom macOS vm tooling.  See the README for
the general gist, but this will spin up VMs with unixgram
capable network interfaces listening to a named socket,
and with a virtio socket device for host-guest communication.

We can add other devices like consoles, serial, etc as needed.

The whole things is buildable with a single make command, and
everything is controllable via the command line using the TailMac
utility.

This should all be generally functional but takes a few shortcuts
with error handling and the like.  The virtio socket device support
has not been tested and may require some refinement.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2024-08-19 15:01:19 -04:00
Andrea Gottardo
1e8f8ee5f1 VERSION.txt: this is v1.73.0 (#13181)
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-08-19 17:17:29 +00:00
Anton Tolchanov
ee976ad704 posture: deduplicate MAC addresses before returning them
Some machines have multiple network interfaces with the same MAC
address.

Updates tailscale/corp#21371

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-08-16 16:22:19 +01:00
Andrea Gottardo
5cbbb48c2e health/dns: reduce severity of DNS unavailable warning (#13152)
`DNS unavailable` was marked as a high severity warning. On Android (and other platforms), these trigger a system notification. Here we reduce the severity level to medium. A medium severity warning will still display the warning icon on platforms with a tray icon because of the `ImpactsConnectivity=true` flag being set here, but it won't show a notification anymore. If people enter an area with bad cellular reception, they're bound to receive so many of these notifications and we need to reduce notification fatigue.

Signed-off-by: Andrea Gottardo <andrea@tailscale.com>
2024-08-16 11:12:06 -04:00
Jordan Whited
ccf091e4a6 wgengine/magicsock: don't upgrade to linuxBatchingConn on Android (#13161)
In a93dc6cdb1 tryUpgradeToBatchingConn()
moved to build tag gated files, but the runtime.GOOS condition excluding
Android was removed unintentionally from batching_conn_linux.go. Add it
back.

Updates tailscale/corp#22348

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-15 14:54:25 -07:00
License Updater
cc136a58ea licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2024-08-15 14:38:12 -07:00
Andrew Lytvynov
d88be7cddf safeweb: add Server.Close method (#13160)
Updates https://github.com/tailscale/corp/issues/14881

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-08-15 10:49:04 -07:00
Andrew Dunham
e107977f75 wgengine/magicsock: disable SIO_UDP_NETRESET on Windows
By default, Windows sets the SIO_UDP_CONNRESET and SIO_UDP_NETRESET
options on created UDP sockets. These behaviours make the UDP socket
ICMP-aware; when the system gets an ICMP message (e.g. an "ICMP Port
Unreachable" message, in the case of SIO_UDP_CONNRESET), it will cause
the underlying UDP socket to throw an error. Confusingly, this can occur
even on reads, if the same UDP socket is used to write a packet that
triggers this response.

The Go runtime disabled the SIO_UDP_CONNRESET behavior in 3114bd6, but
did not change SIO_UDP_NETRESET–probably because that socket option
isn't documented particularly well.

Various other networking code seem to disable this behaviour, such as
the Godot game engine (godotengine/godot#22332) and the Eclipse TCF
agent (link below). Others appear to work around this by ignoring the
error returned (anacrolix/dht#16, among others).

For now, until it's clear whether this ends up in the upstream Go
implementation or not, let's also disable the SIO_UDP_NETRESET in a
similar manner to SIO_UDP_CONNRESET.

Eclipse TCF agent: https://gitlab.eclipse.org/eclipse/tcf/tcf.agent/-/blob/master/agent/tcf/framework/mdep.c

Updates #10976
Updates golang/go#68614

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I70a2f19855f8dec1bfb82e63f6d14fc4a22ed5c3
2024-08-15 12:11:33 -04:00
Flakes Updater
db4247f705 go.mod.sri: update SRI hash for go.mod changes
Signed-off-by: Flakes Updater <noreply+flakes-updater@tailscale.com>
2024-08-14 21:30:13 -07:00
Kyle Carberry
6c852fa817 go.{mod,sum}: migrate from nhooyr.io/websocket to github.com/coder/websocket
Coder has just adopted nhooyr/websocket which unfortunately changes the import path.

`github.com/coder/coder` imports `tailscale.com/net/wsconn` which was still pointing
to `nhooyr.io/websocket`, but this change updates it.

See https://coder.com/blog/websocket

Updates #13154

Change-Id: I3dec6512472b14eae337ae22c5bcc1e3758888d5
Signed-off-by: Kyle Carberry <kyle@carberry.com>
2024-08-14 21:23:49 -07:00
Nick Khyl
f8f9f05ffe cmd/viewer: add support for map-like container types
This PR modifies viewTypeForContainerType to use the last type parameter of a container type
as the value type, enabling the implementation of map-like container types where the second-to-last
(usually first) type parameter serves as the key type.

It also adds a MapContainer type to test the code generation.

Updates #12736

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-08-14 16:33:51 -05:00
Jordan Whited
2f27319baf wgengine/netstack: disable gVisor TCP GRO for Linux (#13138)
A SIGSEGV was observed around packet merging logic in gVisor's GRO
package.

Updates tailscale/corp#22353

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-14 11:36:48 -07:00
Brad Fitzpatrick
2dd71e64ac wgengine/magicsock: log when a ReceiveFunc fails
Updates #10976

Change-Id: I86d30151a25c7d42ed36e273fb207873f4acfdb4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-14 10:34:55 -07:00
Percy Wegmann
74b9fa1348 ipn/localapi: only flush relevant data in multiFilePostResponseWriter.Flush()
This prevents two things:

1. Crashing if there's no response body
2. Sending a nonsensical 0 response status code

Updates tailscale/corp#22357

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-08-14 12:28:40 -05:00
Irbe Krumina
a15ff1bade cmd/k8s-operator,k8s-operator/sessionrecording: support recording kubectl exec sessions over WebSockets (#12947)
cmd/k8s-operator,k8s-operator/sessionrecording: support recording WebSocket sessions

Kubernetes currently supports two streaming protocols, SPDY and WebSockets.
WebSockets are replacing SPDY, see
https://github.com/kubernetes/enhancements/issues/4006.
We were currently only supporting SPDY, erroring out if session
was not SPDY and relying on the kube's built-in SPDY fallback.

This PR:

- adds support for parsing contents of 'kubectl exec' sessions streamed
over WebSockets

- adds logic to distinguish 'kubectl exec' requests for a SPDY/WebSockets
sessions and call the relevant handler

Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-08-14 17:57:50 +01:00
Brad Fitzpatrick
4c2e978f1e cmd/tailscale/cli: support passing network lock keys via files
Fixes tailscale/corp#22356

Change-Id: I959efae716a22bcf582c20d261fb1b57bacf6dd9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-14 09:18:34 -07:00
cai.li
2506bf5b06 fix #13076: codegen error when using anonymous struct
Signed-off-by: cai.li <cai.li@qingteng.cn>
2024-08-13 23:41:39 -05:00
Irbe Krumina
b9f42814b5 cmd/containerboot: optionally serve health check endpoint (#12899)
Add functionality to optionally serve a health check endpoint
(off by default).
Users can enable health check endpoint by setting
TS_HEALTHCHECK_ADDR_PORT to [<addr>]:<port>.
Containerboot will then serve an unauthenticatd HTTP health check at
/healthz at that address. The health check returns 200 OK if the
node has at least one tailnet IP address, else returns 503.

Updates tailscale/tailscale#12898

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-08-14 05:28:29 +01:00
Flakes Updater
b4e595621f go.mod.sri: update SRI hash for go.mod changes
Signed-off-by: Flakes Updater <noreply+flakes-updater@tailscale.com>
2024-08-13 16:37:46 -07:00
Aaron Bieber
c987cf1255 go.mod: pull in latest github.com/creack/pty
This latest version allows for building on various OpenBSD architectures.

(such as openbsd/riscv64)

Updates #8043

Change-Id: Ie9a8738e6aa96335214d5750e090db35e526a4a4
Signed-off-by: Aaron Bieber <aaron@bolddaemon.com>
2024-08-13 16:31:12 -07:00
Brad Fitzpatrick
02581b1603 gokrazy,tstest/integration/nat: add Gokrazy appliance just for natlab
... rather than abusing the generic tsapp.

Per discussion in https://github.com/gokrazy/gokrazy/pull/275

It also means we can remove stuff we don't need, like ntp or randomd.

Updates #13038

Change-Id: Iccf579c354bd3b5025d05fa1128e32f1d5bde4e4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-13 15:26:12 -07:00
Brad Fitzpatrick
b358f489b9 tstest/integration/nat: remove -audio none flag from qemu
It's too new to be supported in Debian bookworm so just remove it.
It doesn't seem to matter or help speed anything up.

Updates #13038

Change-Id: I39077ba8032bebecd75209552b88f1842c843c33
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-13 15:18:55 -07:00
Brad Fitzpatrick
d985da207f tstest/natlab/vnet: fix one-by-one from earlier numbering change
84adfa1ba3 made MAC addresses 1-based too, but didn't adjust this IP address
calculation which was based on the MAC address

Updates #13038

Change-Id: Idc112b303b0b85f41fe51fd61ce1c0d8a3f0f57e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-13 12:32:01 -07:00
Brad Fitzpatrick
b26c53368d tstest/integration/nat: make Tailscale status log print less spammy
No need to print all the internal fields. We only care about the BackendState.

Updates #13038

Change-Id: Iaa0e47ade3c6d30e1887ab1e2a7412ed4e0dab7d
2024-08-13 12:32:01 -07:00
Brad Fitzpatrick
eae6a00651 tstest/integration/nat: crank up verbosity of a failing test
Updates #13038

Change-Id: I36cde97b74e4a675b6c0f3be30f817bccdbe8715
2024-08-13 12:32:01 -07:00
Brad Fitzpatrick
b60a9fce4b gokrazy/tsapp: remove implicit heartbeat package
The heartbeat package does nothing if not configured anyway, so don't
even put it in the image and pay the cost of it running.

Updates #13038
Updates #1866

Change-Id: Id22c0fb1f8395ad21ab0e0350973d31730e8d39f
2024-08-13 12:32:01 -07:00
Brad Fitzpatrick
f79e688e0d cmd/tailscale/cli: fix gokrazy CLI-as-a-service detection
The change in b7e48058c8 was too loose; it also captured the CLI
being run as a child process under cmd/tta.

Updates #13038
Updates #1866

Change-Id: Id410b87132938dd38ed4dd3959473c5d0d242ff5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-13 11:29:25 -07:00
Irbe Krumina
adbab25bac cmd/k8s-operator: fix DNS reconciler for dual-stack clusters (#13057)
* cmd/k8s-operator: fix DNS reconciler for dual-stack clusters

This fixes a bug where DNS reconciler logic was always assuming
that no more than one EndpointSlice exists for a Service.
In fact, there can be multiple, for example, in dual-stack
clusters, but also in other cases this is valid (as per kube docs).
This PR:
- allows for multiple EndpointSlices
- picks out the ones for IPv4 family
- deduplicates addresses

Updates tailscale/tailscale#13056

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-08-13 18:42:01 +01:00
Brad Fitzpatrick
9f1d9d324d gokrazy/tsapp: remove builddirs packages that aren't in config.json
These three packages aren't in gokrazy/tsapp/config.json but
used to be. Unfortunately, that meant that were being included
in the resulting image. Apparently `gok` doesn't delete them or
warn about them being present on disk when they're moved from
the config file.

Updates #13038
Updates #1866

Change-Id: I54918a9e3286ea755b11dde5e9efdd433b8f8fb8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-13 10:32:43 -07:00
Brad Fitzpatrick
b7e48058c8 cmd/tailscale/cli: don't run CLI as a service on gokrazy
Updates #13038
Updates #1866

Change-Id: Ie3223573044a92f5715a827fb66cc6705b38004f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-13 10:32:33 -07:00
Brad Fitzpatrick
84adfa1ba3 tstest/natlab/vnet: standardize on 1-based naming of nodes, networks, MACs
We had a mix of 0-based and 1-based nodes and MACs in logs.

Updates #13038

Change-Id: I36d1b00f7f94b37b4ae2cd439bcdc5dbee6eda4d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-13 08:50:03 -07:00
Brad Fitzpatrick
10d0ce8dde tstest/natlab: get tailscaled logs from gokrazy via syslog
Using https://github.com/gokrazy/gokrazy/pull/275

This is much lower latency than logcatcher, which is higher latency
and chunkier. And this is better than getting it via 'tailscale debug
daemon-logs', which misses early interesting logs.

Updates #13038

Change-Id: I499ec254c003a9494c0e9910f9c650c8ac44ef33
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-13 07:56:29 -07:00
Brad Fitzpatrick
10662c4282 tstest/integration/nat: annotate test 'want' values, fail on mismatch
Updates #13038

Change-Id: Id711ee19e52a7051a2273c806b184c5571c6e24f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-12 20:49:10 -07:00
Nick Khyl
67df9abdc6 util/syspolicy/setting: add package that contains types for the next syspolicy PRs
Package setting contains types for defining and representing policy settings.
It facilitates the registration of setting definitions using Register and RegisterDefinition,
and the retrieval of registered setting definitions via Definitions and DefinitionOf.
This package is intended for use primarily within the syspolicy package hierarchy,
and added in a preparation for the next PRs.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-08-12 21:02:35 -05:00
Brad Fitzpatrick
a61825c7b8 cmd/tta, vnet: add host firewall, env var support, more tests
In particular, tests showing that #3824 works. But that test doesn't
actually work yet; it only gets a DERP connection. (why?)

Updates #13038

Change-Id: Ie1fd1b6a38d4e90fae7e72a0b9a142a95f0b2e8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-12 15:32:12 -07:00
Brad Fitzpatrick
b692985aef client/tailscale: add LocalClient.OmitAuth for tests
Similar to UseSocketOnly, but pulled out separately in case
people are doing unknown weird things.

Updates #13038

Change-Id: I7478e5cb9794439b947440b831caa798941845ea
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-12 15:16:22 -07:00
Brad Fitzpatrick
0686bc8b19 cmd/tailscaled: add env knob to control default verbosity
Updates #13038

Change-Id: Ic0e6dfc7a8d127ab5ce0ae9aab9119c56e19b636
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-12 15:00:13 -07:00
Flakes Updater
0dd9f5397b go.mod.sri: update SRI hash for go.mod changes
Signed-off-by: Flakes Updater <noreply+flakes-updater@tailscale.com>
2024-08-12 14:54:58 -07:00
Maisem Ali
10c2bee9e1 tstest/natlab/vnet: capture network wan/lan interfaces
Updates #13038

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-12 14:54:38 -07:00
Jordan Whited
7aec8d4e6b cmd/stunstamp: refactor connection construction (#13110)
getConns() is now responsible for returning both stable and unstable
conns. conn and measureFn are now passed together via connAndMeasureFn.
newConnAndMeasureFn() is responsible for constructing them.

TCP measurement timeouts are adjusted to more closely match netcheck.

Updates tailscale/corp#22114

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-12 14:09:45 -07:00
Jordan Whited
218110963d cmd/stunstamp: implement HTTPS & TCP latency measurements (#13082)
HTTPS mirrors current netcheck behavior and TCP uses tcp_info->rtt.

Updates tailscale/corp#22114

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-12 13:39:11 -07:00
Paul Scott
bc2744da4b tsweb: fix TestStdHandler_ConnectionClosedDuringBody flake (#13046)
Fixes #13017

Signed-off-by: Paul Scott <paul@tailscale.com>
2024-08-12 16:30:32 +01:00
Brad Fitzpatrick
2e32abc3e2 cmd/tailscaled: allow setting env via linux cmdline for integration tests
Updates #13038

Change-Id: I51e016d0eb7c14647159706c08f017fdedd68e2a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-10 12:01:21 -07:00
Maisem Ali
ce4413a0bc client/tailscale: add Via to UserRuleMatch
This adds the Via field for the https://tailscale.com/kb/1378/via
feature to the ACLPreview response.

Updates tailscale/corp#22239

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-09 18:01:14 -07:00
Brad Fitzpatrick
2a88428f24 tstest/integration/nat: skip some tests by default without flags
Updates #13038

Change-Id: I7ebf8bd8590e65ce4d30dd9f03c713b77868fa36
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
44d634395b tstest/natlab/vnet: add easyAF
Endpoint-indepedent Mapping with only Address (but not port) dependent
filtering.

Updates #13038

Change-Id: I1ec88301acafcb79bf878f9600a7286e8af0f173
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Maisem Ali
d4cc074187 tstest/natlab/vnet: add pcap support
Updates #13038

Change-Id: I89ce2129fee856f97986d6313d2b661c76476c0c
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-09 09:06:54 -07:00
Maisem Ali
d0e8375b53 cmd/{tta,vnet}: proxy to gokrazy UI
Updates #13038

Change-Id: I1cacb1b0f8c3d0e4c36b7890155f7b1ad0d23575
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-09 09:06:54 -07:00
Maisem Ali
072d1a4b77 gokrazy: bump
Updates #13038

Change-Id: Ie1a5b8930d5cce6f45ce67102da06a9474444af7
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
194ff6ee3d tstest/integration/nat: add sameLAN node type
To test local connections.

Updates #13038

Change-Id: I575dcab31ca812edf7d04fa126772611cf89b9a7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
730fec1cfd tstest/integration/nat: add start of TestGrid
Updates #13038

Change-Id: I41d1c2bf20ae6dfbb071020d9dc2b742e7995835
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
f47a5fe52b vnet: reduce some log spam
Updates #13038

Change-Id: I76038a90dfde10a82063988a5b54190074d4b5c5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
bb3e95c40d vnet: fix port mapping (w/ maisem + andrew)
Co-authored-by: Maisem Ali <maisem@tailscale.com>
Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I703b39f05af2e3e1a979be8e77091586cb9ec3eb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Maisem Ali
f8d23b3582 tstest/integration/nat: stream daemon logs directly
Updates #13038

Signed-off-by: Maisem Ali <maisem@tailscale.com>
Change-Id: I5da5706149c082c27d74c8b894bf53dd9b259e84
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
17a10f702f vnet: add network.logf
Updates #13038

Change-Id: Ia5a9359b8bfa18264d64600dfa1ef01eb8728dc2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
082e46b48d vnet: don't hard-code bradfitz or maisem in paths
Updates #13038

Change-Id: Ie8c7591fac3800bb3b7f8c35356cce309fd3c164
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
6798f8ea88 tstest/natlab/vnet: add port mapping
Updates #13038

Change-Id: Iaf274d250398973790873534b236d5cbb34fbe0e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Maisem Ali
12764e9db4 natlab: add NodeAgentClient
This adds a new NodeAgentClient type that can be used to
invoke the LocalAPI using the LocalClient instead of
handcrafted URLs. However, there are certain cases where
it does make sense for the node agent to provide more
functionality than whats possible with just the LocalClient,
as such it also exposes a http.Client to make requests directly.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
1016aa045f hostinfo: add hostinfo.IsNATLabGuestVM
And don't make guests under vnet/natlab upload to logcatcher,
as there won't be a valid cert anyway.

Updates #13038

Change-Id: Ie1ce0139788036b8ecc1804549a9b5d326c5fef5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Brad Fitzpatrick
8594292aa4 vnet: add control/derps to test, stateful firewall
Updates #13038

Change-Id: Icd65b34c5f03498b5a7109785bb44692bce8911a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-09 09:06:54 -07:00
Jordan Whited
20691894f5 cmd/stunstamp: refactor to support multiple protocols (#13063)
'stun' has been removed from metric names and replaced with a protocol
label. This refactor is preparation work for HTTPS & ICMP support.

Updates tailscale/corp#22114

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-09 08:03:58 -07:00
Nick Khyl
f23932bd98 net/dns/resolver: log forwarded query details when TS_DEBUG_DNS_FORWARD_SEND is enabled
Troubleshooting DNS resolution issues often requires additional information.
This PR expands the effect of the TS_DEBUG_DNS_FORWARD_SEND envknob to forwarder.forwardWithDestChan,
and includes the request type, domain name length, and the first 3 bytes of the domain's SHA-256 hash in the output.

Fixes #13070

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-08-08 15:57:35 -05:00
Brad Fitzpatrick
a867a4869d go.toolchain.rev: bump Go toolchain for net pkg resolv.conf fix
Updates tailscale/corp#22206

Change-Id: I9d995d408d4be3fd552a0d6e12bf79db8461d802
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-08 13:35:40 -07:00
Andrew Lytvynov
c0c4791ce7 cmd/gitops-pusher: ignore previous etag if local acls match control (#13068)
In a situation when manual edits are made on the admin panel, around the
GitOps process, the pusher will be stuck if `--fail-on-manual-edits` is
set, as expected.

To recover from this, there are 2 options:
1. revert the admin panel changes to get back in sync with the code
2. check in the manual edits to code

The former will work well, since previous and local ETags will match
control ETag again. The latter will still fail, since local and control
ETags match, but previous does not.

For this situation, check the local ETag against control first and
ignore previous when things are already in sync.

Updates https://github.com/tailscale/corp/issues/22177

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-08-08 13:23:06 -07:00
Andrew Lytvynov
ad038f4046 cmd/gitops-pusher: add --fail-on-manual-edits flag (#13066)
For cases where users want to be extra careful about not overwriting
manual changes, add a flag to hard-fail. This is only useful if the etag
cache is persistent or otherwise reliable. This flag should not be used
in ephemeral CI workers that won't persist the cache.

Updates https://github.com/tailscale/corp/issues/22177

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-08-08 11:21:28 -07:00
Anton Tolchanov
46db698333 prober: make status page more clear
Updates tailscale/corp#20583

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-08-08 17:34:29 +01:00
Naman Sood
f79183dac7 cmd/tsidp: add funnel support (#12591)
* cmd/tsidp: add funnel support

Updates #10263.

Signed-off-by: Naman Sood <mail@nsood.in>

* look past funnel-ingress-node to see who we're authenticating

Signed-off-by: Naman Sood <mail@nsood.in>

* fix comment typo

Signed-off-by: Naman Sood <mail@nsood.in>

* address review feedback, support Basic auth for /token

Turns out you need to support Basic auth if you do client ID/secret
according to OAuth.

Signed-off-by: Naman Sood <mail@nsood.in>

* fix typos

Signed-off-by: Naman Sood <mail@nsood.in>

* review fixes

Signed-off-by: Naman Sood <mail@nsood.in>

* remove debugging log

Signed-off-by: Naman Sood <mail@nsood.in>

* add comments, fix header

Signed-off-by: Naman Sood <mail@nsood.in>

---------

Signed-off-by: Naman Sood <mail@nsood.in>
2024-08-08 10:46:45 -04:00
Brad Fitzpatrick
1ed958fe23 tstest/natlab/vnet: add start of virtual network-based NAT Lab
Updates #13038

Change-Id: I3c74120d73149c1329288621f6474bbbcaa7e1a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-07 09:37:15 -07:00
Brad Fitzpatrick
6ca078c46e cmd/derper: move 204 handler from package main to derphttp
Updates #13038

Change-Id: I28a8284dbe49371cae0e9098205c7c5f17225b40
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-06 17:53:33 -07:00
Jordan Whited
a93dc6cdb1 wgengine/magicsock: refactor batchingUDPConn to batchingConn interface (#13042)
This commit adds a batchingConn interface, and renames batchingUDPConn
to linuxBatchingConn. tryUpgradeToBatchingConn() may return a platform-
specific implementation of batchingConn. So far only a Linux
implementation of this interface exists, but this refactor is being
done in anticipation of a Windows implementation.

Updates tailscale/corp#21874

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-06 09:00:28 -07:00
Anton Tolchanov
7bac5dffcb control/controlhttp: extract the last network connection
The same context we use for the HTTP request here might be re-used by
the dialer, which could result in `GotConn` being called multiple times.
We only care about the last one.

Fixes #13009

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-08-06 11:42:06 +01:00
Anton Tolchanov
b3fc345aba cmd/derpprobe: use a status page from the prober library
Updates tailscale/corp#20583

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-08-06 11:27:59 +01:00
Anton Tolchanov
9106187a95 prober: support JSON response in RunHandler
Updates tailscale/corp#20583

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-08-06 11:27:59 +01:00
Anton Tolchanov
9b08399d9e prober: add a status page handler
This change adds an HTTP handler with a table showing a list of all
probes, their status, and a button that allows triggering a specific
probe.

Updates tailscale/corp#20583

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-08-06 11:27:59 +01:00
Anton Tolchanov
153a476957 prober: add an HTTP endpoint for triggering a probe
- Keep track of the last 10 probe results and successful probe
  latencies;
- Add an HTTP handler that triggers a given probe by name and returns it
  result as a plaintext HTML page, showing recent probe results as a
  baseline

Updates tailscale/corp#20583

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-08-06 11:27:59 +01:00
Anton Tolchanov
227509547f {control,net}: close idle connections of custom transports
I noticed a few places with custom http.Transport where we are not
closing idle connections when transport is no longer used.

Updates tailscale/corp#21609

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-08-05 17:28:15 +01:00
VimT
e3f047618b net/socks5: support UDP
Updates #7581

Signed-off-by: VimT <me@vimt.me>
2024-08-05 09:25:24 -07:00
Kot C
91d2e1772d words: raccoon dog, dog with the raccoon in 'im
Signed-off-by: Kot C <kot@yukata.dev>
2024-08-05 09:24:33 -07:00
License Updater
3b6849e362 licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2024-08-05 08:45:07 -07:00
Anton Tolchanov
0fd73746dd cmd/tailscale/cli: fix revoke-keys command name in CLI output
During review of #8644 the `recover-compromised-key` command was renamed
to `revoke-key`, but the old name remained in some messages printed by
the command.

Fixes tailscale/corp#19446

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-08-05 14:49:48 +01:00
Jordan Whited
17c88a19be net/captivedetection: mark TestAllEndpointsAreUpAndReturnExpectedResponse flaky (#13021)
Updates #13019

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-03 22:08:55 +00:00
Jordan Whited
25f0a3fc8f wgengine/netstack: use build tags to exclude gVisor GRO importation on iOS (#13015)
Updates tailscale/corp#22125

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-03 15:03:44 -07:00
Maisem Ali
a7a394e7d9 tstest/integration: mark TestNATPing flaky
Updates #12169

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-03 15:02:17 -07:00
Maisem Ali
07e2487c1d wgengine/capture: fix v6 field typo in wireshark dissector
It was using a v4 field for a v6 address.

Updates tailscale/corp#8020

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-03 14:56:17 -07:00
Maisem Ali
1dd9c44d51 tsweb: mark TestStdHandler_ConnectionClosedDuringBody flaky
Updates #13107

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-03 14:54:10 -07:00
Flakes Updater
0a6eb12f05 go.mod.sri: update SRI hash for go.mod changes
Signed-off-by: Flakes Updater <noreply+flakes-updater@tailscale.com>
2024-08-03 11:45:38 -07:00
Maisem Ali
f205efcf18 net/packet/checksum: fix v6 NAT
We were copying 12 out of the 16 bytes which meant that
the 1:1 NAT required would only work if the last 4 bytes
happened to match between the new and old address, something
that our tests accidentally had. Fix it by copying the full
16 bytes and make the tests also verify the addr and use rand
addresses.

Updates #9511

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-03 11:38:00 -07:00
Maisem Ali
a917718353 util/linuxfw: return nil interface not concrete type
It was returning a nil `*iptablesRunner` instead of a
nil `NetfilterRunner` interface which would then fail
checks later.

Fixes #13012

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-08-03 09:53:46 -07:00
Nick Khyl
4099a36468 util/winutil/gp: fix a busy loop bug
Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-08-02 20:16:41 -05:00
Jordan Whited
d9d9d525d9 wgengine/netstack: increase gVisor's TCP send and receive buffer sizes (#12994)
This commit increases gVisor's TCP max send (4->6MiB) and receive
(4->8MiB) buffer sizes on all platforms except iOS. These values are
biased towards higher throughput on high bandwidth-delay product paths.

The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. 100ms of RTT latency is
introduced via Linux's traffic control network emulator queue
discipline.

The first set of results are from commit f0230ce prior to TCP buffer
resizing.

gVisor write direction:
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   180 MBytes   151 Mbits/sec    0  sender
[  5]   0.00-10.10  sec   179 MBytes   149 Mbits/sec       receiver

gVisor read direction:
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.10  sec   337 MBytes   280 Mbits/sec   20 sender
[  5]   0.00-10.00  sec   323 MBytes   271 Mbits/sec         receiver

The second set of results are from this commit with increased TCP
buffer sizes.

gVisor write direction:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   297 MBytes   249 Mbits/sec    0 sender
[  5]   0.00-10.10  sec   297 MBytes   247 Mbits/sec        receiver

gVisor read direction:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.10  sec   501 MBytes   416 Mbits/sec   17  sender
[  5]   0.00-10.00  sec   485 MBytes   407 Mbits/sec       receiver

Updates #9707
Updates tailscale/corp#22119

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-02 15:50:47 -07:00
Andrew Dunham
9939374c48 wgengine/magicsock: use cloud metadata to get public IPs
Updates #12774

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1661b6a2da7966ab667b075894837afd96f4742f
2024-08-02 16:05:14 -04:00
Andrea Gottardo
4055b63b9b net/captivedetection: exclude cellular data interfaces (#13002)
Updates tailscale/tailscale#1634

This PR optimizes captive portal detection on Android and iOS by excluding cellular data interfaces (`pdp*` and `rmnet`). As cellular networks do not present captive portals, frequent network switches between Wi-Fi and cellular would otherwise trigger captive detection unnecessarily, causing battery drain.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-08-02 12:23:48 -07:00
Jordan Whited
f0230ce0b5 go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GRO for Linux (#12921)
This commit implements TCP GRO for packets being written to gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported IP checksum functions.
gVisor is updated in order to make use of newly exported
stack.PacketBuffer GRO logic.

TCP throughput towards gVisor, i.e. TUN write direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement, sometimes as high as 2x. High bandwidth-delay product
paths remain receive window limited, bottlenecked by gVisor's default
TCP receive socket buffer size. This will be addressed in a  follow-on
commit.

The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.

The first result is from commit 57856fc without TCP GRO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  4.77 GBytes  4.10 Gbits/sec   20 sender
[  5]   0.00-10.00  sec  4.77 GBytes  4.10 Gbits/sec      receiver

The second result is from this commit with TCP GRO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.6 GBytes  9.14 Gbits/sec   20 sender
[  5]   0.00-10.00  sec  10.6 GBytes  9.14 Gbits/sec      receiver

Updates #6816

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-02 10:41:10 -07:00
Brad Fitzpatrick
cc370314e7 health: don't show login error details with context cancelations
Fixes #12991

Change-Id: I2a5e109395761b720ecf1069d0167cf0caf72876
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-08-01 08:29:27 -07:00
Aaron Klotz
655b4f8fc5 net/netns: remove some logspam by avoiding logging parse errors due to unspecified addresses
I updated the address parsing stuff to return a specific error for
unspecified hosts passed as empty strings, and look for that
when logging errors. I explicitly did not make parseAddress return a
netip.Addr containing an unspecified address because at this layer,
in the absence of any host, we don't necessarily know the address
family we're dealing with.

For the purposes of this code I think this is fine, at least until
we implement #12588.

Fixes #12979

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2024-07-31 12:34:16 -06:00
Brad Fitzpatrick
004dded0a8 net/tlsdial: relax self-signed cert health warning
It seems some security software or macOS itself might be MITMing TLS
(for ScreenTime?), so don't warn unless it fails x509 validation
against system roots.

Updates #3198

Change-Id: I6ea381b5bb6385b3d51da4a1468c0d803236b7bf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-31 10:03:48 -07:00
Aaron Klotz
0def4f8e38 net/netns: on Windows, fall back to default interface index when unspecified address is passed to ControlC and bindToInterfaceByRoute is enabled
We were returning an error instead of binding to the default interface.

Updates #12979

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2024-07-31 10:58:45 -06:00
Jordan Whited
7bc2ddaedc go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869)
This commit implements TCP GSO for packets being read from gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported GSO logic from its tun
package.

A new gVisor stack.LinkEndpoint implementation has been established
(linkEndpoint) that is loosely modeled after its predecessor
(channel.Endpoint). This new implementation supports GSO of monster TCP
segments up to 64K in size, whereas channel.Endpoint only supports up to
32K. linkEndpoint will also be required for GRO, which will be
implemented in a follow-on commit.

TCP throughput from gVisor, i.e. TUN read direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement through a wide range of RTT and loss conditions, sometimes
as high as 5x.

The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.

The first result is from commit 57856fc without TCP GSO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.51 GBytes  2.15 Gbits/sec  154 sender
[  5]   0.00-10.00  sec  2.49 GBytes  2.14 Gbits/sec      receiver

The second result is from this commit with TCP GSO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  12.6 GBytes  10.8 Gbits/sec    6 sender
[  5]   0.00-10.00  sec  12.6 GBytes  10.8 Gbits/sec      receiver

Updates #6816

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
Andrea Gottardo
949b15d858 net/captivedetection: call SetHealthy once connectivity restored (#12974)
Fixes tailscale/tailscale#12973
Updates tailscale/tailscale#1634

There was a logic issue in the captive detection code we shipped in https://github.com/tailscale/tailscale/pull/12707.

Assume a captive portal has been detected, and the user notified. Upon switching to another Wi-Fi that does *not* have a captive portal, we were issuing a signal to interrupt any pending captive detection attempt. However, we were not also setting the `captive-portal-detected` warnable to healthy. The result was that any "captive portal detected" alert would not be cleared from the UI.

Also fixes a broken log statement value.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-07-30 13:39:25 -07:00
Jonathan Nobels
8a8ecac6a7 net/dns, cmd/tailscaled: plumb system health tracker into dns cleanup (#12969)
fixes tailscale#12968

The dns manager cleanup func was getting passed a nil
health tracker, which will panic.  Fixed to pass it
the system health tracker.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2024-07-30 12:54:03 -04:00
Irbe Krumina
eead25560f build_docker.sh: update script comment (#12970)
It is no longer correct to state that we don't support running Tailscale in containers or on Kubernetes.

Updates tailscale/tailscale#12842

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-07-30 15:12:43 +01:00
dependabot[bot]
1b64961320 build(deps): bump github.com/docker/docker (#12966)
Bumps [github.com/docker/docker](https://github.com/docker/docker) from 25.0.5+incompatible to 26.1.4+incompatible.
- [Release notes](https://github.com/docker/docker/releases)
- [Commits](https://github.com/docker/docker/compare/v25.0.5...v26.1.4)

---
updated-dependencies:
- dependency-name: github.com/docker/docker
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-30 12:46:14 +01:00
Irbe Krumina
32308fcf71 Dockerfile: add a warning that this is not used to build our published images (#12955)
Add a warning that the Dockerfile in the OSS repo is not the
currently used mechanism to build the images we publish - for folks
who want to contribute to image build scripts or otherwise need to
understand the image build process that we use.

Updates#cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-07-30 12:22:53 +01:00
Flakes Updater
34de96d06e go.mod.sri: update SRI hash for go.mod changes
Signed-off-by: Flakes Updater <noreply+flakes-updater@tailscale.com>
2024-07-29 19:40:24 -07:00
Brad Fitzpatrick
575feb486f util/osuser: turn wasm check into a const expression
All wasi* are GOARCH wasm, so check that instead.

Updates #12732

Change-Id: Id3cc346295c1641bcf80a6c5eb1ad65488509656
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-29 19:39:55 -07:00
Brad Fitzpatrick
2ab1d532e8 gokrazy/tsapp: add go.mod replacing two tailscale.com binaries with parent module
Updates #1866

Change-Id: I1ee7d41f7ee55806fb7ad94d0333dd0ec33d8efd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-29 19:07:25 -07:00
Brad Fitzpatrick
360046e5c3 words: add some associated with scales
Updates tailscale/corp#14698

Change-Id: Ica7f179bd368d3c15f58fb236d377881cd80efcf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-29 15:18:08 -07:00
Andrew Dunham
35a8fca379 cmd/tailscale/cli: release portmap after netcheck
Updates #12954

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ic14f037b48a79b1263b140c6699579b466d89310
2024-07-29 14:10:32 -04:00
Jonathan Nobels
19b0c8a024 net/dns, health: raise health warning for failing forwarded DNS queries (#12888)
updates tailscale/corp#21823

Misconfigured, broken, or blocked DNS will often present as
"internet is broken'" to the end user.  This  plumbs the health tracker
into the dns manager and forwarder and adds a health warning
with a 5 second delay that is raised on failures in the forwarder and
lowered on successes.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2024-07-29 13:48:46 -04:00
Percy Wegmann
3088c6105e go.mod: pull in latest github.com/tailscale/xnet
This picks up https://github.com/tailscale/xnet/pull/1 so that
clients can move files even when holding only a lock for the source
file.

Updates #12941

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-07-29 10:41:53 -05:00
Irbe Krumina
a21bf100f3 cmd/k8s-operator,k8s-operator/sessionrecording,sessionrecording,ssh/tailssh: refactor session recording functionality (#12945)
cmd/k8s-operator,k8s-operator/sessionrecording,sessionrecording,ssh/tailssh: refactor session recording functionality

Refactor SSH session recording functionality (mostly the bits related to
Kubernetes API server proxy 'kubectl exec' session recording):

- move the session recording bits used by both Tailscale SSH
and the Kubernetes API server proxy into a shared sessionrecording package,
to avoid having the operator to import ssh/tailssh

- move the Kubernetes API server proxy session recording functionality
into a k8s-operator/sessionrecording package, add some abstractions
in preparation for adding support for a second streaming protocol (WebSockets)

Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-07-29 13:57:11 +01:00
Paul Scott
1bf7ed0348 tsweb: add QuietLogging option (#12838)
Allows the use of tsweb.LogHandler exclusively for callbacks describing the
handler HTTP requests.

Fixes #12837

Signed-off-by: Paul Scott <paul@tailscale.com>
2024-07-29 13:53:01 +01:00
Irbe Krumina
c5623e0471 go.{mod,sum},tstest/tools,k8s-operator,cmd/k8s-operator: autogenerate CRD API docs (#12884)
Re-instates the functionality that generates CRD API docs, but using
a different library as the one we were using earlier seemed to have
some issues with its Git history.
Also regenerates the docs (make kube-generate-all).

Updates tailscale/tailscale#12859

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-07-29 11:50:27 +01:00
Ross Williams
1bf82ddf84 util/osuser: run getent on non-Linux Unixes
Remove the restriction that getent is skipped on non-Linux unixes.
Improve validation of the parsed output from getent, in case unknown
systems return unusable information.

Fixes #12730.

Signed-off-by: Ross Williams <ross@ross-williams.net>
2024-07-26 14:25:46 -07:00
Andrea Gottardo
6840f471c0 net/dnsfallback: set CanPort80 in static DERPMap (#12929)
Updates tailscale/corp#21949

As discussed with @raggi, this PR updates the static DERPMap embedded in the client to reflect the availability of HTTP on the DERP servers run by Tailscale.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-07-26 13:04:12 -07:00
Andrea Gottardo
90be06bd5b health: introduce captive-portal-detected Warnable (#12707)
Updates tailscale/tailscale#1634

This PR introduces a new `captive-portal-detected` Warnable which is set to an unhealthy state whenever a captive portal is detected on the local network, preventing Tailscale from connecting.



ipn/ipnlocal: fix captive portal loop shutdown


Change-Id: I7cafdbce68463a16260091bcec1741501a070c95

net/captivedetection: fix mutex misuse

ipn/ipnlocal: ensure that we don't fail to start the timer


Change-Id: I3e43fb19264d793e8707c5031c0898e48e3e7465

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-07-26 11:25:55 -07:00
Brad Fitzpatrick
cf97cff33b wgengine/netstack: simplify netaddrIPFromNetstackIP
Updates #cleanup

Change-Id: I66878b08a75d44170460cbf33c895277c187bd8d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-25 20:05:16 -07:00
Paul Scott
855da47777 tsweb: Add MiddlewareStack func to apply lists of Middleware (#12907)
Fixes #12909

Signed-off-by: Paul Scott <paul@tailscale.com>
2024-07-25 14:20:17 +01:00
Nick Khyl
43375c6efb types/lazy: re-init SyncValue during test cleanup if it wasn't set before SetForTest
Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-07-24 11:47:58 -05:00
Paul Scott
ba7f2d129e tsweb: log all cancellations as 499s (#12894)
Updates #12141

Signed-off-by: Paul Scott <paul@tailscale.com>
2024-07-24 08:58:06 +01:00
Irbe Krumina
57856fc0d5 ipn,wgengine/magicsock: allow setting static node endpoints via tailscaled configfile (#12882)
wgengine/magicsock,ipn: allow setting static node endpoints via tailscaled config file.

Adds a new StaticEndpoints field to tailscaled config
that can be used to statically configure the endpoints
that the node advertizes. This field will replace
TS_DEBUG_PRETENDPOINTS env var that can be used to achieve the same.

Additionally adds some functionality that ensures that endpoints
are updated when configfile is reloaded.

Also, refactor configuring/reconfiguring components to use the
same functionality when configfile is parsed the first time or
subsequent times (after reload). Previously a configfile reload
did not result in resetting of prefs. Now it does- but does not yet
tell the relevant components to consume the new prefs. This is to
be done in a follow-up.

Updates tailscale/tailscale#12578


Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-07-23 16:50:55 +01:00
License Updater
9904421853 licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2024-07-22 14:50:50 -07:00
Nick Khyl
5d09649b0b types/lazy: add (*SyncValue[T]).SetForTest method
It is sometimes necessary to change a global lazy.SyncValue for the duration of a test. This PR adds a (*SyncValue[T]).SetForTest method to facilitate that.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-07-22 15:10:31 -05:00
Nick Khyl
d500a92926 util/slicesx: add HasPrefix, HasSuffix, CutPrefix, and CutSuffix functions
The standard library includes these for strings and byte slices,
but it lacks similar functions for generic slices of comparable types.
Although they are not as commonly used, these functions are useful
in scenarios such as working with field index sequences (i.e., []int)
via reflection.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-07-22 11:03:46 -05:00
Flakes Updater
1f94047475 go.mod.sri: update SRI hash for go.mod changes
Signed-off-by: Flakes Updater <noreply+flakes-updater@tailscale.com>
2024-07-21 14:29:01 -07:00
Nick Khyl
bd54b61746 types/opt: add (Value[T]).GetOr(def T) T method
Updates #12736

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-07-19 15:50:24 -05:00
Nick Khyl
20562a4fb9 cmd/viewer, types/views, util/codegen: add viewer support for custom container types
This adds support for container-like types such as Container[T] that
don't explicitly specify a view type for T. Instead, a package implementing
a container type should also implement and export a ContainerView[T, V] type
and a ContainerViewOf(*Container[T]) ContainerView[T, V] function, which
returns a view for the specified container, inferring the element view type V
from the element type T.

Updates #12736

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-07-19 12:50:39 -05:00
Andrew Lytvynov
e7bf6e716b cmd/tailscale: add --min-validity flag to the cert command (#12822)
Some users run "tailscale cert" in a cron job to renew their
certificates on disk. The time until the next cron job run may be long
enough for the old cert to expire with our default heristics.

Add a `--min-validity` flag which ensures that the returned cert is
valid for at least the provided duration (unless it's longer than the
cert lifetime set by Let's Encrypt).

Updates #8725

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-07-19 09:35:22 -07:00
Lee Briggs
32ce18716b Add extra environment variables in deployment template (#12858)
Fixes #12857

Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
2024-07-19 06:52:27 -07:00
Irbe Krumina
0f57b9340b cmd/k8s-operator,tstest,go.{mod,sum}: remove fybrik.io/crdoc dependency (#12862)
Remove fybrik.io/crdoc dependency as it is causing issues for folks attempting
to vendor tailscale using GOPROXY=direct.
This means that the CRD API docs in ./k8s-operator/api.md will no longer
be generated- I am going to look at replacing it with another tool
in a follow-up.

Updates tailscale/tailscale#12859

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-07-19 14:17:28 +01:00
Paul Scott
b2c522ce95 tsweb: log cancelled requests as 499
Fixes #12860

Signed-off-by: Paul Scott <paul@tailscale.com>
2024-07-19 11:30:38 +01:00
Adrian Dewhurst
54f58d1143 ipn/ipnlocal: add comment explaining auto exit node migration
Updates tailscale/corp#19681

Change-Id: I6d396780b058ff0fbea0e9e53100f04ef3b76339
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
2024-07-18 16:48:43 -04:00
Mario Minardi
485018696a {tool,client}: bump node version (#12840)
Bump node version to latest lts on the 18.x line which is 18.20.4 at the time of writing.

Updates https://github.com/tailscale/corp/issues/21741

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-07-18 13:12:42 -06:00
Nick Khyl
1608831c33 wgengine/router: use quad-100 as the nexthop on Windows
Windows requires routes to have a nexthop. Routes created using the interface's local IP address or an unspecified IP address ("0.0.0.0" or "::") as the nexthop are considered on-link routes. Notably, Windows treats on-link subnet routes differently, reserving the last IP in the range as the broadcast IP and therefore prohibiting TCP connections to it, resulting in WSA error 10049: "The requested address is not valid in its context. This does not happen with single-host routes, such as routes to Tailscale IP addresses, but becomes a problem with advertised subnets when all IPs in the range should be reachable.

Before Windows 8, only routes created with an unspecified IP address were considered on-link, so our previous approach of using the interface's own IP as the nexthop likely worked on Windows 7.

This PR updates configureInterface to use the TailscaleServiceIP (100.100.100.100) and its IPv6 counterpart as the nexthop for subnet routes.

Fixes tailscale/support-escalations#57

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-07-18 10:08:29 -05:00
Brad Fitzpatrick
d3af54444c client/tailscale: document ACLTestFailureSummary.User field
And justify its legacy name.

Updates #1931

Change-Id: I3eff043679bf8f046aed6e2c4fb7592fe2e66514
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-18 08:02:49 -07:00
Paul Scott
d97cddd876 tsweb: swallow panics
With this change, the error handling and request logging are all done in defers
after calling inner.ServeHTTP. This ensures that any recovered values which we
want to re-panic with retain a useful stacktrace.  However, we now only
re-panic from errorHandler when there's no outside logHandler. Which if you're
using StdHandler there always is. We prefer this to ensure that we are able to
write a 500 Internal Server Error to the client. If a panic hits http.Server
then the response is not sent back.

Updates #12784

Signed-off-by: Paul Scott <paul@tailscale.com>
2024-07-18 15:41:04 +01:00
Brad Fitzpatrick
f77821fd63 derp/derphttp: determine whether a region connect was to non-ideal node
... and then do approximately nothing with that information, other
than a big TODO. This is mostly me relearning this code and leaving
breadcrumbs for others in the future.

Updates #12724

Signed-off-by: Brad Fitzpatrick <brad@danga.com>
2024-07-17 14:59:45 -07:00
Brad Fitzpatrick
0b32adf9ec hostinfo: set Hostinfo.PackageType for mkctr container builds
Fixes tailscale/corp#21448

Change-Id: Id60fb5cd7d31ef94cdbb176141e034845a480a00
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-17 11:26:16 -07:00
Cameron Stokes
1ac14d7216 Dockerfile: remove warning (#12841)
Fixes tailscale/tailscale#12842

Signed-off-by: Cameron Stokes <cameron@cameronstokes.com>
2024-07-17 10:30:15 -07:00
Aaron Klotz
4ff276cf52 VERSION.txt: this is v1.71.0
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2024-07-17 11:27:05 -06:00
Irbe Krumina
2742153f84 cmd/k8s-operator: add a metric to track the amount of ProxyClass resources (#12833)
Updates tailscale/tailscale#10709

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-07-17 14:34:56 +01:00
Paul Scott
646990a7d0 tsweb: log once per request
StdHandler/retHandler would previously emit one log line for each request.
If there were multiple StdHandler in the chain, there would be one log line
per instance of retHandler.

With this change, only the outermost StdHandler/logHandler actually logs the
request or invokes OnStart or OnCompletion callbacks. The error-rendering part
of retHandler lives on in errorHandler, and errorHandler passes those errors up
the stack to logHandler through a callback that logHandler places in the
request.Context().

Updates tailscale/corp#19999

Signed-off-by: Paul Scott <paul@tailscale.com>
2024-07-16 15:52:23 +01:00
Adrian Dewhurst
8882c6b730 ipn/ipnlocal: wait for DERP before auto exit node migration
Updates tailscale/corp#19681

Change-Id: I31dec154aa3b5edba01f10eec37640f631729cb2
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
2024-07-15 12:53:03 -04:00
License Updater
35d2efd692 licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2024-07-15 08:44:32 -07:00
Anton Tolchanov
fc074a6b9f client/tailscale: add the nodeAttrs section
This change allows ACL contents to include node attributes
https://tailscale.com/kb/1337/acl-syntax#node-attributes-nodeattrs

Updates tailscale/corp#20583

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-07-15 16:43:48 +01:00
Paul Scott
014bf25c0a tsweb: fix TestStdHandler_panic flake
Fixes #12816

Signed-off-by: Paul Scott <paul@tailscale.com>
2024-07-15 16:34:13 +01:00
Adrian Dewhurst
0834712c91 ipn: allow FQDN in exit node selection
To match the format of exit node suggestions and ensure that the result
is not ambiguous, relax exit node CLI selection to permit using a FQDN
including the trailing dot.

Updates #12618

Change-Id: I04b9b36d2743154aa42f2789149b2733f8555d3f
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
2024-07-15 11:22:30 -04:00
Paul Scott
fec41e4904 tsweb: add stack trace to panic error msg
Updates #12784

Signed-off-by: Paul Scott <paul@tailscale.com>
2024-07-15 10:34:13 +01:00
Nick Khyl
fd0acc4faf cmd/cloner, cmd/viewer: add _test prefix for files generated with the test build tag
Updates #12736

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-07-12 15:31:34 -05:00
Fran Bull
380a3a0834 appc: track metrics for route info storing
Track how often we're writing state and how many routes we're writing.

Updates #11008

Signed-off-by: Fran Bull <fran@tailscale.com>
2024-07-12 10:39:48 -07:00
Anton Tolchanov
5d61d1c7b0 log/sockstatlog: don't block for more than 5s on shutdown
Fixes tailscale/corp#21618

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-07-12 17:50:11 +01:00
Linus Brogan
9609b26541 cmd/tailscale: resolve taildrive share paths
Fixes #12258.

Signed-off-by: Linus Brogan <git@linusbrogan.com>
2024-07-12 11:47:48 -05:00
Anton Tolchanov
7403d8e9a8 logtail: close idle HTTP connections on shutdown
Fixes tailscale/corp#21609

Co-authored-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-07-12 17:47:30 +01:00
Jordan Whited
f0b9d3f477 net/tstun: fix docstring for Wrapper.SetWGConfig (#12796)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-12 09:28:35 -07:00
Andrea Gottardo
3f3edeec07 health: drop unnecessary logging in TestSetUnhealthyWithTimeToVisible (#12795)
Fixes tailscale/tailscale#12794

We were printing some leftover debug logs within a callback function that would be executed after the test completion, causing the test to fail. This change drops the log calls to address the issue.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-07-12 16:05:27 +00:00
Brad Fitzpatrick
808b4139ee wgengine/magicsock: use wireguard-go/conn.PeerAwareEndpoint
If we get an non-disco presumably-wireguard-encrypted UDP packet from
an IP:port we don't recognize, rather than drop the packet, give it to
WireGuard anyway and let WireGuard try to figure out who it's from and
tell us.

This uses the new hook added in https://github.com/tailscale/wireguard-go/pull/27

Updates tailscale/corp#20732

Change-Id: I5c61a40143810592f9efac6c12808a87f924ecf2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-12 08:24:06 -07:00
Claire Wang
49bf63cdd0 ipn/ipnlocal: check for offline auto exit node in SetControlClientStatus (#12772)
Updates tailscale/corp#19681

Signed-off-by: Claire Wang <claire@tailscale.com>
2024-07-12 11:06:07 -04:00
Joe Tsai
d209b032ab syncs: add Map.WithLock to allow mutations to the underlying map (#8101)
Some operations cannot be implemented with the prior API:
* Iterating over the map and deleting keys
* Iterating over the map and replacing items
* Calling APIs that expect a native Go map

Add a Map.WithLock method that acquires a write-lock on the map
and then calls a user-provided closure with the underlying Go map.
This allows users to interact with the Map as a regular Go map,
but with the gaurantees that it is concurrent safe.

Updates tailscale/corp#9115

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-07-11 16:16:30 -07:00
Nick Khyl
fc28c8e7f3 cmd/cloner, cmd/viewer, util/codegen: add support for generic types and interfaces
This adds support for generic types and interfaces to our cloner and viewer codegens.
It updates these packages to determine whether to make shallow or deep copies based
on the type parameter constraints. Additionally, if a template parameter or an interface
type has View() and Clone() methods, we'll use them for getters and the cloner of the
owning structure.

Updates #12736

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-07-11 16:38:53 -05:00
Andrea Gottardo
b7c3cfe049 health: support delayed Warnable visibility (#12783)
Updates tailscale/tailscale#4136

To reduce the likelihood of presenting spurious warnings, add the ability to delay the visibility of certain Warnables, based on a TimeToVisible time.Duration field on each Warnable. The default is zero, meaning that a Warnable is immediately visible to the user when it enters an unhealthy state.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-07-11 18:51:47 +00:00
KevinLiang10
8d7b78f3f7 net/dns/publicdns: remove additional information in DOH URL passed to IPv6 address generation for controlD.
This commit truncates any additional information (mainly hostnames) that's passed to controlD via DOH URL in DoHIPsOfBase.
This change is to make sure only resolverID is passed to controlDv6Gen but not the additional information.

Updates: #7946
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
2024-07-10 16:14:05 -04:00
Mario Minardi
041733d3d1 publicapi: add note that API docs have moved to existing docs files (#12770)
Add note that API docs have moved to `https://tailscale.com/api` to the
top of existing API docs markdown files.

Updates https://github.com/tailscale/corp/issues/1301

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-07-10 12:42:34 -06:00
Anton Tolchanov
874972b683 posture: add network hardware addresses to posture identity
If an optional `hwaddrs` URL parameter is present, add network interface
hardware addresses to the posture identity response.

Just like with serial numbers, this requires client opt-in via MDM or
`tailscale set --posture-checking=true`
(https://tailscale.com/kb/1326/device-identity)

Updates tailscale/corp#21371

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-07-10 18:28:30 +01:00
Lee Briggs
b546a6e758 wgengine/magicsock: allow a CSV list for pretendpoint
Load Balancers often have more than one ingress IP, so allowing us to
add multiple means we can offer multiple options.

Updates #12578

Change-Id: I4aa49a698d457627d2f7011796d665c67d4c7952
Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
2024-07-10 09:57:28 -07:00
Brad Fitzpatrick
c6af5bbfe8 all: add test for package comments, fix, add comments as needed
Updates #cleanup

Change-Id: Ic4304e909d2131a95a38b26911f49e7b1729aaef
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-10 09:57:00 -07:00
Joe Tsai
e92f4c6af8 syncs: add generic Pool (#12759)
Pool is a type-safe wrapper over sync.Pool.

Updates tailscale/corp#11038
Updates #cleanup

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-07-10 09:39:52 -07:00
Irbe Krumina
986d60a094 cmd/k8s-operator: add metrics for attempted/uploaded session recordings (#12765)
Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-07-10 14:00:42 +01:00
Irbe Krumina
6a982faa7d cmd/k8s-operator: send container name to session recorder (#12763)
Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-07-10 10:48:53 +01:00
Anton Tolchanov
c8f258a904 prober: propagate DERPMap request creation errors
Updates tailscale/corp#8497

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-07-09 13:43:51 +01:00
Nick Khyl
726d5d507d cmd/k8s-operator: update depaware.txt
This fixes an issue caused by the merge order of 2b638f550d and 8bd442ba8c.

Updates #Cleanup

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-07-08 23:02:27 -05:00
Maisem Ali
2238ca8a05 go.mod: bump bart
Updates #bart

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-07-08 19:10:44 -07:00
Nick Khyl
8bd442ba8c util/winutil/gp, net/dns: add package for Group Policy API
This adds a package with GP-related functions and types to be used in the future PRs.
It also updates nrptRuleDatabase to use the new package instead of its own gpNotificationWatcher implementation.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-07-08 20:37:03 -05:00
Andrew Lytvynov
7b1c764088 ipn/ipnlocal: gate systemd-run flags on systemd version (#12747)
We added a workaround for --wait, but didn't confirm the other flags,
which were added in systemd 235 and 236. Check systemd version for
deciding when to set all 3 flags.

Fixes #12136

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-07-08 16:40:06 -07:00
Andrew Lytvynov
b8af91403d clientupdate: return true for CanAutoUpdate for macsys (#12746)
While `clientupdate.Updater` won't be able to apply updates on macsys,
we use `clientupdate.CanAutoUpdate` to gate the EditPrefs endpoint in
localAPI. We should allow the GUI client to set AutoUpdate.Apply on
macsys for it to properly get reported to the control plane. This also
allows the tailnet-wide default for auto-updates to propagate to macsys
clients.

Updates https://github.com/tailscale/corp/issues/21339

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-07-08 15:54:50 -07:00
Nick Khyl
e21d8768f9 types/opt: add generic Value[T any] for optional values of any types
Updates #12736

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-07-08 17:00:43 -05:00
Maisem Ali
5576972261 client/tailscale: use safesocket.ConnectContext
I apparently missed this in 4b6a0c42c8.

Updates tailscale/corp#18266

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-07-08 13:59:41 -07:00
Irbe Krumina
ba517ab388 cmd/k8s-operator,ssh/tailssh,tsnet: optionally record 'kubectl exec' sessions via Kubernetes operator's API server proxy (#12274)
cmd/k8s-operator,ssh/tailssh,tsnet: optionally record kubectl exec sessions

The Kubernetes operator's API server proxy, when it receives a request
for 'kubectl exec' session now reads 'RecorderAddrs', 'EnforceRecorder'
fields from tailcfg.KubernetesCapRule.
If 'RecorderAddrs' is set to one or more addresses (of a tsrecorder instance(s)),
it attempts to connect to those and sends the session contents
to the recorder before forwarding the request to the kube API
server. If connection cannot be established or fails midway,
it is only allowed if 'EnforceRecorder' is not true (fail open).

Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
2024-07-08 21:18:55 +01:00
Maisem Ali
2b638f550d cmd/k8s-operator: add depaware.txt
Updates #12742

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-07-08 12:43:10 -07:00
License Updater
9102a5bb73 licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2024-07-08 11:19:29 -05:00
Flakes Updater
c8fe9f0064 go.mod.sri: update SRI hash for go.mod changes
Signed-off-by: Flakes Updater <noreply+flakes-updater@tailscale.com>
2024-07-08 11:19:06 -05:00
Brad Fitzpatrick
42dac7c5c2 wgengine/magicsock: add debug envknob for injecting an endpoint
For testing. Lee wants to play with 'AWS Global Accelerator Custom
Routing with Amazon Elastic Kubernetes Service'. If this works well
enough, we can promote it.

Updates #12578

Change-Id: I5018347ed46c15c9709910717d27305d0aedf8f4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-08 07:59:40 -07:00
Brad Fitzpatrick
d2fef01206 control/controlknobs,tailcfg,wgengine/magicsock: remove DRPO shutoff switch
The DERP Return Path Optimization (DRPO) is over four years old (and
on by default for over two) and we haven't had problems, so time to
remove the emergency shutoff code (controlknob) which we've never
used. The controlknobs are only meant for new features, to mitigate
risk. But we don't want to keep them forever, as they kinda pollute
the code.

Updates #150

Change-Id: If021bc8fd1b51006d8bddd1ffab639bb1abb0ad1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-06 19:50:53 -07:00
Brad Fitzpatrick
9df107f4f0 wgengine/magicsock: use derp-region-as-magic-AddrPort hack in fewer places
And fix up a bogus comment and flesh out some other comments.

Updates #cleanup

Change-Id: Ia60a1c04b0f5e44e8d9587914af819df8e8f442a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-06 19:38:59 -07:00
Aaron Klotz
e181f12a7b util/winutil/s4u: fix some doc comments in the s4u package
This is #cleanup

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2024-07-05 13:19:47 -07:00
Brad Fitzpatrick
c4b20c5411 go.mod: bump github.com/tailscale/wireguard-go
Updates tailscale/corp#20732

Change-Id: Ic0272fe9a226afef4e23dfca5da8cd1d550c1cd6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-05 09:44:15 -07:00
Tom Proctor
01a7726cf7 cmd/containerboot,cmd/k8s-operator: enable IPv6 for fqdn egress proxies (#12577)
cmd/containerboot,cmd/k8s-operator: enable IPv6 for fqdn egress proxies

Don't skip installing egress forwarding rules for IPv6 (as long as the host
supports IPv6), and set headless services `ipFamilyPolicy` to
`PreferDualStack` to optionally enable both IP families when possible. Note
that even with `PreferDualStack` set, testing a dual-stack GKE cluster with
the default DNS setup of kube-dns did not correctly set both A and
AAAA records for the headless service, and instead only did so when
switching the cluster DNS to Cloud DNS. For both IPv4 and IPv6 to work
simultaneously in a dual-stack cluster, we require headless services to
return both A and AAAA records.

If the host doesn't support IPv6 but the FQDN specified only has IPv6
addresses available, containerboot will exit with error code 1 and an
error message because there is no viable egress route.

Fixes #12215

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-07-05 12:21:48 +01:00
Andrea Gottardo
309afa53cf health: send ImpactsConnectivity value over LocalAPI (#12700)
Updates tailscale/tailscale#4136

We should make sure to send the value of ImpactsConnectivity over to the clients using LocalAPI as they need it to display alerts in the GUI properly.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-07-03 20:19:06 +00:00
Charlotte Brandhorst-Satzkorn
42f01afe26 cmd/tailscale/cli: exit node filter should display all exit node options (#12699)
This change expands the `exit-node list -filter` command to display all
location based exit nodes for the filtered country. This allows users
to switch to alternative servers when our recommended exit node is not
working as intended.

This change also makes the country filter matching case insensitive,
e.g. both USA and usa will work.

Updates #12698

Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
2024-07-03 11:48:20 -07:00
Chris Palmer
59936e6d4a scripts: don't refresh the pacman repository on Arch (#12194)
Fixes #12186

Signed-off-by: Chris Palmer <cpalmer@tailscale.com>
Co-authored-by: Chris Palmer <cpalmer@tailscale.com>
2024-07-03 09:58:01 -07:00
Andrea Gottardo
732af2f6e0 health: reduce severity of some warnings, improve update messages (#12689)
Updates tailscale/tailscale#4136

High severity health warning = a system notification will appear, which can be quite disruptive to the user and cause unnecessary concern in the event of a temporary network issue.

Per design decision (@sonovawolf), the severity of all warnings but "network is down" should be tuned down to medium/low. ImpactsConnectivity should be set, to change the icon to an exclamation mark in some cases, but without a notification bubble.

I also tweaked the messaging for update-available, to reflect how each platform gets updates in different ways.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-07-02 23:11:28 -07:00
Andrew Lytvynov
458decdeb0 go.toolchain.rev: update to Go 1.22.5 (#12690)
Updates https://github.com/tailscale/corp/issues/21304

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-07-02 14:39:30 -07:00
Jonathan Nobels
4e5ef5b628 net/dns: fix broken dns benchmark tests (#12686)
Updates tailscale/corp#20677

The recover function wasn't getting set in the benchmark
tests.  Default changed to an empty func.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2024-07-02 14:22:13 -04:00
Flakes Updater
012933635b go.mod.sri: update SRI hash for go.mod changes
Signed-off-by: Flakes Updater <noreply+flakes-updater@tailscale.com>
2024-07-01 16:58:27 -07:00
Brad Fitzpatrick
da32468988 version/mkversion: allow env config of oss git cache dir
Updates tailscale/corp#21262

Change-Id: I80bd880b53f6d851c15479f39fad62b25f1095f1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-01 16:22:55 -07:00
Jordan Whited
ddf94a7b39 cmd/stunstamp: fix handling of invalid DERP map resp (#12679)
Updates tailscale/corp#20344

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-01 16:07:48 -07:00
Brad Fitzpatrick
b56058d7e3 tool/gocross: fix regression detecting when gocross needs rebuild
Fix regression from #8108 (Mar 2023). Since that change, gocross has
always been rebuilt on each run of ./tool/go (gocross-wrapper.sh),
adding ~100ms.  (Well, not totally rebuilt; cmd/go's caching still
ends up working fine.)

The problem was $gocross_path was just "gocross", which isn't in my
path (and "." isn't in my $PATH, as it shouldn't be), so this line was
always evaluating to the empty string:

    gotver="$($gocross_path gocross-version 2>/dev/null || echo '')"

The ./gocross is fine because of the earlier `cd "$repo_root"`

Updates tailscale/corp#21262
Updates tailscale/corp#21263

Change-Id: I80d25446097a3bb3423490c164352f0b569add5f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-01 14:40:51 -07:00
License Updater
d780755340 licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2024-07-01 10:31:21 -07:00
Percy Wegmann
489b990240 tailcfg: bump CurrentCapabilityVersion to capture SSH agent forwarding fix
Updates #12467

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-07-01 11:57:55 -05:00
Tom Proctor
d15250aae9 go.{mod,sum}: bump mkctr (#12654)
go get github.com/tailscale/mkctr@main

Pulls in changes to support a local target that only pushes
a single-platform image to the machine's local image store.

Fixes tailscale/mkctr#18

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-07-01 10:23:46 +01:00
Claire Wang
8965e87fa8 ipn/ipnlocal: handle auto value for ExitNodeID syspolicy (#12512)
Updates tailscale/corp#19681

Signed-off-by: Claire Wang <claire@tailscale.com>
2024-06-28 23:17:31 -04:00
James Tucker
114d1caf55 derp/xdp: retain the link so that the fd is not closed
BPF links require that the owning FD remains open, this FD is embedded
into the RawLink returned by the attach function and must live for the
duration of the server.

Updates ENG-4274

Signed-off-by: James Tucker <james@tailscale.com>
2024-06-28 14:38:21 -07:00
James Tucker
b565a9faa7 cmd/xdpderper: add autodetection for default interface name
This makes deployment easier in hetrogenous environments.

Updates ENG-4274

Signed-off-by: James Tucker <james@tailscale.com>
2024-06-27 15:42:11 -07:00
Anton Tolchanov
781f79408d ipn/ipnlocal: allow multiple signature chains from the same SigCredential
Detection of duplicate Network Lock signature chains added in
01847e0123 failed to account for chains
originating with a SigCredential signature, which is used for wrapped
auth keys. This results in erroneous removal of signatures that
originate from the same re-usable auth key.

This change ensures that multiple nodes created by the same re-usable
auth key are not getting filtered out by the network lock.

Updates tailscale/corp#19764

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-06-27 19:28:57 +01:00
Anton Tolchanov
4651827f20 tka: test SigCredential signatures and netmap filtering
This change moves handling of wrapped auth keys to the `tka` package and
adds a test covering auth key originating signatures (SigCredential) in
netmap.

Updates tailscale/corp#19764

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-06-27 19:28:57 +01:00
Adrian Dewhurst
8f7588900a ipn/ipnlocal: fix nil pointer dereference and add related test
Fixes #12644

Change-Id: I3589b01a9c671937192caaedbb1312fd906ca712
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
2024-06-27 14:21:59 -04:00
Jordan Whited
0bb82561ba go.mod: update wireguard-go (#12645)
This pulls in device.WaitPool fixes from tailscale/wireguard-go@1e08883
and tailscale/wireguard-go@cfa4567.

Updates tailscale/corp#21095

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-06-27 10:32:14 -07:00
Andrew Lytvynov
2064dc20d4 health,ipn/ipnlocal: hide update warning when auto-updates are enabled (#12631)
When auto-udpates are enabled, we don't need to nag users to update
after a new release, before we release auto-updates.

Updates https://github.com/tailscale/corp/issues/20081

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-06-27 09:36:29 -07:00
Anton Tolchanov
23c5870bd3 tsnet: do not log an error on shutdown
Updates tailscale/corp#20583

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-06-27 13:28:19 +01:00
Josh McKinney
18939df0a7 fix: broken tests for localhost
Signed-off-by: Josh McKinney <joshka@users.noreply.github.com>
2024-06-26 20:57:19 -07:00
Josh McKinney
1d6ab9f9db cmd/serve: don't convert localhost to 127.0.0.1
This is not valid in many situations, specifically when running a local astro site that listens on localhost, but ignores 127.0.0.1

Fixes: https://github.com/tailscale/tailscale/issues/12201

Signed-off-by: Josh McKinney <joshka@users.noreply.github.com>
2024-06-26 20:57:19 -07:00
Brad Fitzpatrick
210264f942 cmd/derper: clarify that derper and tailscaled need to be in sync
Fixes #12617

Change-Id: Ifc87b7d9cf699635087afb57febd01fb9a6d11b7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-06-26 19:46:42 -07:00
Brad Fitzpatrick
6b801a8e9e cmd/derper: link to various derper docs in more places
In hopes it'll be found more.

Updates tailscale/corp#20844

Change-Id: Ic92ee9908f45b88f8770de285f838333f9467465
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-06-26 19:46:35 -07:00
Flakes Updater
b3f91845dc go.mod.sri: update SRI hash for go.mod changes
Signed-off-by: Flakes Updater <noreply+flakes-updater@tailscale.com>
2024-06-26 19:43:06 -07:00
James Tucker
46fda6bf4c cmd/derper: add some DERP diagnostics pointers
A few other minor language updates.

Updates tailscale/corp#20844

Change-Id: Idba85941baa0e2714688cc8a4ec3e242e7d1a362
Signed-off-by: James Tucker <james@tailscale.com>
2024-06-26 19:18:28 -07:00
Brad Fitzpatrick
9766f0e110 net/dns: move mutex before the field it guards
And some misc doc tweaks for idiomatic Go style.

Updates #cleanup

Change-Id: I3ca45f78aaca037f433538b847fd6a9571a2d918
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-06-26 16:56:02 -07:00
dependabot[bot]
94defc4056 build(deps): bump golang.org/x/image from 0.15.0 to 0.18.0
Bumps [golang.org/x/image](https://github.com/golang/image) from 0.15.0 to 0.18.0.
- [Commits](https://github.com/golang/image/compare/v0.15.0...v0.18.0)

---
updated-dependencies:
- dependency-name: golang.org/x/image
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-06-26 16:19:35 -07:00
Aaron Klotz
b292f7f9ac util/winutil/s4u: fix incorrect token type specified in s4u Login
This was correct before, I think I just made a copy/paste error when
updating that PR.

Updates #12383

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2024-06-26 14:28:56 -06:00
Aaron Klotz
5f177090e3 util/winutil: ensure domain controller address is used when retrieving remote profile information
We cannot directly pass a flat domain name into NetUserGetInfo; we must
resolve the address of a domain controller first.

This PR implements the appropriate resolution mechanisms to do that, and
also exposes a couple of new utility APIs for future needs.

Fixes #12627

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2024-06-26 13:10:10 -06:00
Andrew Dunham
0323dd01b2 ci: enable checklocks workflow for specific packages
This turns the checklocks workflow into a real check, and adds
annotations to a few basic packages as a starting point.

Updates #12625

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2b0185bae05a843b5257980fc6bde732b1bdd93f
2024-06-26 13:55:07 -04:00
Andrew Dunham
8487fd2ec2 wgengine/magicsock: add more DERP home clientmetrics
Updates tailscale/corp#18095

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I423adca2de0730092394bb5fd5796cd35557d352
2024-06-26 11:44:26 -04:00
Adrian Dewhurst
a6b13e6972 cmd/tailscale/cli: correct command emitted by exit node suggestion
The exit node suggestion CLI command was written with the assumption
that it's possible to provide a stableid on the command line, but this
is incorrect. Instead, it will now emit the name of the exit node.

Fixes #12618

Change-Id: Id7277f395b5fca090a99b0d13bfee7b215bc9802
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
2024-06-26 11:29:14 -04:00
Naman Sood
75254178a0 ipn/ipnlocal: don't bind localListener if its context is canceled (#12621)
The context can get canceled during backoff, and binding after that
makes the listener impossible to close afterwards.

Fixes #12620.

Signed-off-by: Naman Sood <mail@nsood.in>
2024-06-26 11:18:45 -04:00
Anton Tolchanov
787ead835f tsweb: accept a function to call before request handling
To complement the existing `onCompletion` callback, which is called
after request handler.

Updates tailscale/corp#17075

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-06-26 11:27:26 +01:00
Andrea Gottardo
6e55d8f6a1 health: add warming-up warnable (#12553) 2024-06-25 22:02:38 -07:00
Andrew Dunham
30f8d8199a ipn/ipnlocal: fix data race in tests
We can observe a data race in tests when logging after a test is
finished. `b.onHealthChange` is called in a goroutine after being
registered with `health.Tracker.RegisterWatcher`, which calls callbacks
in `setUnhealthyLocked` in a new goroutine.

See: https://github.com/tailscale/tailscale/actions/runs/9672919302/job/26686038740

Updates #12054

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibf22cc994965d88a9e7236544878d5373f91229e
2024-06-25 21:43:22 -07:00
Aaron Klotz
da078b4c09 util/winutil: add package for logging into Windows via Service-for-User (S4U)
This PR ties together pseudoconsoles, user profiles, s4u logons, and
process creation into what is (hopefully) a simple API for various
Tailscale services to obtain Windows access tokens without requiring
knowledge of any Windows passwords. It works both for domain-joined
machines (Kerberos) and non-domain-joined machines. The former case
is fairly straightforward as it is fully documented. OTOH, the latter
case is not documented, though it is fully defined in the C headers in
the Windows SDK. The documentation blanks were filled in by reading
the source code of Microsoft's Win32 port of OpenSSH.

We need to do a bit of acrobatics to make conpty work correctly while
creating a child process with an s4u token; see the doc comments above
startProcessInternal for details.

Updates #12383

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2024-06-25 22:05:52 -06:00
Andrew Dunham
53a5d00fff net/dns: ensure /etc/resolv.conf is world-readable even with a umask
Previously, if we had a umask set (e.g. 0027) that prevented creating a
world-readable file, /etc/resolv.conf would be created without the o+r
bit and thus other users may be unable to resolve DNS.

Since a umask only applies to file creation, chmod the file after
creation and before renaming it to ensure that it has the appropriate
permissions.

Updates #12609

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2a05d64f4f3a8ee8683a70be17a7da0e70933137
2024-06-26 00:02:05 -04:00
Andrew Dunham
8161024176 wgengine/magicsock: always set home DERP if no control conn
The logic we added in #11378 would prevent selecting a home DERP if we
have no control connection.

Updates tailscale/corp#18095

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I44bb6ac4393989444e4961b8cfa27dc149a33c6e
2024-06-25 23:31:14 -04:00
Andrew Dunham
a475c435ec net/dns/resolver: fix test failure
Updates #cleanup

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I0e815a69ee44ca0ff7c0ea0ca3c6904bbf67ed1f
2024-06-25 23:08:08 -04:00
Jonathan Nobels
27033c6277 net/dns: recheck DNS config on SERVFAIL errors (#12547)
Fixes tailscale/corp#20677

Replaces the original attempt to rectify this (by injecting a netMon
event) which was both heavy handed, and missed cases where the
netMon event was "minor".

On apple platforms, the fetching the interface's nameservers can
and does return an empty list in certain situations.   Apple's API
in particular is very limiting here.  The header hints at notifications
for dns changes which would let us react ahead of time, but it's all
private APIs.

To avoid remaining in the state where we end up with no
nameservers but we absolutely need them, we'll react
to a lack of upstream nameservers by attempting to re-query
the OS.

We'll rate limit this to space out the attempts.   It seems relatively
harmless to attempt a reconfig every 5 seconds (triggered
by an incoming query) if the network is in this broken state.

Missing nameservers might possibly be a persistent condition
(vs a transient error), but that would  also imply that something
out of our control is badly misconfigured.

Tested by randomly returning [] for the nameservers.   When switching
between Wifi networks, or cell->wifi, this will randomly trigger
the bug, and we appear to reliably heal the DNS state.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2024-06-25 14:56:13 -04:00
Brad Fitzpatrick
d5e692f7e7 ipn/ipnlocal: check operator user via osuser package
So non-local users (e.g. Kerberos on FreeIPA) on Linux can be looked
up. Our default binaries are built with pure Go os/user which only
supports the classic /etc/passwd and not any libc-hooked lookups.

Updates #12601

Change-Id: I9592db89e6ca58bf972f2dcee7a35fbf44608a4f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-06-25 10:56:32 -07:00
Jordan Whited
94415e8029 cmd/stunstamp: remove sqlite DB and API (#12604)
stunstamp now sends data to Prometheus via remote write, and Prometheus
can serve the same data. Retaining and cleaning up old data in sqlite
leads to long probing pauses, and it's not worth investing more effort
to optimize the schema and/or concurrency model.

Updates tailscale/corp#20344

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-06-25 10:21:40 -07:00
Brad Fitzpatrick
3485e4bf5a derp: make RunConnectionLoop funcs take Messages, support PeerPresentFlags
PeerPresentFlags was added in 5ffb2668ef but wasn't plumbed through to
the RunConnectionLoop. Rather than add yet another parameter (as
IP:port was added earlier), pass in the raw PeerPresentMessage and
PeerGoneMessage struct values, which are the same things, plus two
fields: PeerGoneReasonType for gone and the PeerPresentFlags from
5ffb2668ef.

Updates tailscale/corp#17816

Change-Id: Ib19d9f95353651ada90656071fc3656cf58b7987
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-06-25 09:47:25 -07:00
Fran Bull
7eb8a77ac8 appc: don't schedule advertisement of 0 routes
When the store-appc-routes flag is on for a tailnet we are writing the
routes more often than seems necessary. Investigation reveals that we
are doing so ~every time we observe a dns response, even if this causes
us not to advertise any new routes. So when we have no new routes,
instead do not advertise routes.

Fixes #12593

Signed-off-by: Fran Bull <fran@tailscale.com>
2024-06-25 08:12:51 -07:00
Irbe Krumina
24a40f54d9 util/linuxfw: verify that IPv6 if available if (#12598)
nftable runner for an IPv6 address gets requested.

Updates tailscale/tailscale#12215

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-06-25 14:13:49 +01:00
Brad Fitzpatrick
d91e5c25ce derp: redo, simplify how mesh update writes are queued/written
I couldn't convince myself the old way was safe and couldn't lose
writes.

And it seemed too complicated.

Updates tailscale/corp#21104

Change-Id: I17ba7c7d6fd83458a311ac671146a1f6a458a5c1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-06-24 21:42:14 -07:00
Brad Fitzpatrick
ded7734c36 derp: account for increased size of peerPresent messages in mesh updates
sendMeshUpdates tries to write as much as possible without blocking,
being careful to check the bufio.Writer.Available size before writes.

Except that regressed in 6c791f7d60 which made those messages larger, which
meants we were doing network I/O with the Server mutex held.

Updates tailscale/corp#13945

Change-Id: Ic327071d2e37de262931b9b390cae32084811919
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-06-24 16:21:01 -07:00
Andrew Dunham
200d92121f types/lazy: add Peek method to SyncValue
This adds the ability to "peek" at the value of a SyncValue, so that
it's possible to observe a value without computing this.

Updates tailscale/corp#17122

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I06f88c22a1f7ffcbc7ff82946335356bb0ef4622
2024-06-24 12:41:00 -07:00
Aaron Klotz
7dd76c3411 net/netns: add Windows support for bind-to-interface-by-route
This is implemented via GetBestInterfaceEx. Should we encounter errors
or fail to resolve a valid, non-Tailscale interface, we fall back to
returning the index for the default interface instead.

Fixes #12551

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2024-06-24 10:43:34 -06:00
tailscale-license-updater[bot]
591979b95f licenses: update license notices (#12414)
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
Co-authored-by: License Updater <noreply+license-updater@tailscale.com>
2024-06-24 09:20:34 -07:00
Brad Fitzpatrick
91786ff958 cmd/derper: add debug endpoint to adjust mutex profiling rate
Updates #3560

Change-Id: I474421ce75c79fb66e1c306ed47daebc5a0e069e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-06-24 09:05:31 -07:00
Brad Fitzpatrick
5ffb2668ef derp: add PeerPresentFlags bitmask to Watch messages
Updates tailscale/corp#17816

Change-Id: Ib5baf6c981a6a4c279f8bbfef02048cfbfb3323b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-06-22 20:38:25 -07:00
Aaron Klotz
d7a4f9d31c net/dns: ensure multiple hosts with the same IP address are combined into a single HostEntry
This ensures that each line has a unique IP address.

Fixes #11939

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2024-06-21 13:16:49 -06:00
Jordan Whited
0d6e71df70 cmd/stunstamp: add explicit metric to track timeout events (#12564)
Timeouts could already be identified as NaN values on
stunstamp_derp_stun_rtt_ns, but we can't use NaN effectively with
promql to visualize them. So, this commit adds a timeouts metric that
we can use with rate/delta/etc promql functions.

Updates tailscale/corp#20689

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-06-21 09:17:35 -07:00
Kristoffer Dalby
dcb0f189cc cmd/proxy-to-grafana: add flag for alternative control server
Fixes #12571

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-06-21 12:17:39 +02:00
712 changed files with 68835 additions and 15947 deletions

View File

@@ -18,11 +18,17 @@ jobs:
runs-on: [ ubuntu-latest ]
steps:
- name: Check out code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Build checklocks
run: ./tool/go build -o /tmp/checklocks gvisor.dev/gvisor/tools/checklocks/cmd/checklocks
- name: Run checklocks vet
# TODO: remove || true once we have applied checklocks annotations everywhere.
run: ./tool/go vet -vettool=/tmp/checklocks ./... || true
# TODO(#12625): add more packages as we add annotations
run: |-
./tool/go vet -vettool=/tmp/checklocks \
./envknob \
./ipn/store/mem \
./net/stun/stuntest \
./net/wsconn \
./proxymap

View File

@@ -45,17 +45,17 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
# Install a more recent Go that understands modern go.mod content.
- name: Install Go
uses: actions/setup-go@v4
uses: actions/setup-go@41dfa10bad2bb2ae585af6ee5bb4d7d973ad74ed # v5.1.0
with:
go-version-file: go.mod
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v2
uses: github/codeql-action/init@4f3212b61783c3c68e8309a0f18a699764811cda # v3.27.1
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
@@ -66,7 +66,7 @@ jobs:
# Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@v2
uses: github/codeql-action/autobuild@4f3212b61783c3c68e8309a0f18a699764811cda # v3.27.1
# Command-line programs to run using the OS shell.
# 📚 https://git.io/JvXDl
@@ -80,4 +80,4 @@ jobs:
# make release
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v2
uses: github/codeql-action/analyze@4f3212b61783c3c68e8309a0f18a699764811cda # v3.27.1

View File

@@ -10,6 +10,6 @@ jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: "Build Docker image"
run: docker build .

View File

@@ -17,7 +17,7 @@ jobs:
id-token: "write"
contents: "read"
steps:
- uses: "actions/checkout@v4"
- uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
with:
ref: "${{ (inputs.tag != null) && format('refs/tags/{0}', inputs.tag) || '' }}"
- uses: "DeterminateSystems/nix-installer-action@main"

View File

@@ -23,18 +23,18 @@ jobs:
name: lint
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- uses: actions/setup-go@v4
- uses: actions/setup-go@41dfa10bad2bb2ae585af6ee5bb4d7d973ad74ed # v5.1.0
with:
go-version-file: go.mod
cache: false
- name: golangci-lint
# Note: this is the 'v3' tag as of 2023-08-14
uses: golangci/golangci-lint-action@639cd343e1d3b897ff35927a75193d57cfcba299
# Note: this is the 'v6.1.0' tag as of 2024-08-21
uses: golangci/golangci-lint-action@aaa42aa0628b4ae2578232a66b541047968fac86
with:
version: v1.56
version: v1.60
# Show only new issues if it's a pull request.
only-new-issues: true

View File

@@ -14,7 +14,7 @@ jobs:
steps:
- name: Check out code into the Go module directory
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Install govulncheck
run: ./tool/go install golang.org/x/vuln/cmd/govulncheck@latest
@@ -24,7 +24,7 @@ jobs:
- name: Post to slack
if: failure() && github.event_name == 'schedule'
uses: slackapi/slack-github-action@v1.24.0
uses: slackapi/slack-github-action@37ebaef184d7626c5f204ab8d3baff4262dd30f0 # v1.27.0
env:
SLACK_BOT_TOKEN: ${{ secrets.GOVULNCHECK_BOT_TOKEN }}
with:

View File

@@ -67,6 +67,11 @@ jobs:
image: ${{ matrix.image }}
options: --user root
steps:
- name: install dependencies (pacman)
# Refresh the package databases to ensure that the tailscale package is
# defined.
run: pacman -Sy
if: contains(matrix.image, 'archlinux')
- name: install dependencies (yum)
# tar and gzip are needed by the actions/checkout below.
run: yum install -y --allowerasing tar gzip ${{ matrix.deps }}
@@ -93,7 +98,7 @@ jobs:
# We cannot use v4, as it requires a newer glibc version than some of the
# tested images provide. See
# https://github.com/actions/checkout/issues/1487
uses: actions/checkout@v3
uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
- name: run installer
run: scripts/installer.sh
# Package installation can fail in docker because systemd is not running

View File

@@ -17,7 +17,7 @@ jobs:
runs-on: [ ubuntu-latest ]
steps:
- name: Check out code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Build and lint Helm chart
run: |
eval `./tool/go run ./cmd/mkversion`

View File

@@ -17,7 +17,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Check out code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Run SSH integration tests
run: |
make sshintegrationtest

View File

@@ -50,7 +50,7 @@ jobs:
- shard: '4/4'
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: build test wrapper
run: ./tool/go build -o /tmp/testwrapper ./cmd/testwrapper
- name: integration tests as root
@@ -78,9 +78,9 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Restore Cache
uses: actions/cache@v3
uses: actions/cache@6849a6489940f00c2f30c0fb92c6274307ccb58a # v4.1.2
with:
# Note: unlike the other setups, this is only grabbing the mod download
# cache, rather than the whole mod directory, as the download cache
@@ -150,16 +150,16 @@ jobs:
runs-on: windows-2022
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Install Go
uses: actions/setup-go@v4
uses: actions/setup-go@41dfa10bad2bb2ae585af6ee5bb4d7d973ad74ed # v5.1.0
with:
go-version-file: go.mod
cache: false
- name: Restore Cache
uses: actions/cache@v3
uses: actions/cache@6849a6489940f00c2f30c0fb92c6274307ccb58a # v4.1.2
with:
# Note: unlike the other setups, this is only grabbing the mod download
# cache, rather than the whole mod directory, as the download cache
@@ -190,7 +190,7 @@ jobs:
options: --privileged
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: chown
run: chown -R $(id -u):$(id -g) $PWD
- name: privileged tests
@@ -202,7 +202,7 @@ jobs:
if: github.repository == 'tailscale/tailscale'
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Run VM tests
run: ./tool/go test ./tstest/integration/vms -v -no-s3 -run-vm-tests -run=TestRunUbuntu2004
env:
@@ -214,7 +214,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: build all
run: ./tool/go install -race ./cmd/...
- name: build tests
@@ -258,9 +258,9 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Restore Cache
uses: actions/cache@v3
uses: actions/cache@6849a6489940f00c2f30c0fb92c6274307ccb58a # v4.1.2
with:
# Note: unlike the other setups, this is only grabbing the mod download
# cache, rather than the whole mod directory, as the download cache
@@ -295,7 +295,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: build some
run: ./tool/go build ./ipn/... ./wgengine/ ./types/... ./control/controlclient
env:
@@ -317,9 +317,9 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Restore Cache
uses: actions/cache@v3
uses: actions/cache@6849a6489940f00c2f30c0fb92c6274307ccb58a # v4.1.2
with:
# Note: unlike the other setups, this is only grabbing the mod download
# cache, rather than the whole mod directory, as the download cache
@@ -350,7 +350,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
# Super minimal Android build that doesn't even use CGO and doesn't build everything that's needed
# and is only arm64. But it's a smoke build: it's not meant to catch everything. But it'll catch
# some Android breakages early.
@@ -365,9 +365,9 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Restore Cache
uses: actions/cache@v3
uses: actions/cache@6849a6489940f00c2f30c0fb92c6274307ccb58a # v4.1.2
with:
# Note: unlike the other setups, this is only grabbing the mod download
# cache, rather than the whole mod directory, as the download cache
@@ -399,7 +399,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: test tailscale_go
run: ./tool/go test -tags=tailscale_go,ts_enable_sockstats ./net/sockstats/...
@@ -456,18 +456,22 @@ jobs:
fuzz-seconds: 300
dry-run: false
language: go
- name: Set artifacts_path in env (workaround for actions/upload-artifact#176)
if: steps.run.outcome != 'success' && steps.build.outcome == 'success'
run: |
echo "artifacts_path=$(realpath .)" >> $GITHUB_ENV
- name: upload crash
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@b4b15b8c7c6ac21ea08fcf65892d2ee8f75cf882 # v4.4.3
if: steps.run.outcome != 'success' && steps.build.outcome == 'success'
with:
name: artifacts
path: ./out/artifacts
path: ${{ env.artifacts_path }}/out/artifacts
depaware:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: check depaware
run: |
export PATH=$(./tool/go env GOROOT)/bin:$PATH
@@ -477,7 +481,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: check that 'go generate' is clean
run: |
pkgs=$(./tool/go list ./... | grep -Ev 'dnsfallback|k8s-operator|xdp')
@@ -490,7 +494,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: check that 'go mod tidy' is clean
run: |
./tool/go mod tidy
@@ -502,7 +506,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: check licenses
run: ./scripts/check_license_headers.sh .
@@ -518,7 +522,7 @@ jobs:
goarch: "386"
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: install staticcheck
run: GOBIN=~/.local/bin ./tool/go install honnef.co/go/tools/cmd/staticcheck
- name: run staticcheck
@@ -559,7 +563,7 @@ jobs:
# By having the job always run, but skipping its only step as needed, we
# let the CI output collapse nicely in PRs.
if: failure() && github.event_name == 'push'
uses: ruby/action-slack@v3.2.1
uses: slackapi/slack-github-action@37ebaef184d7626c5f204ab8d3baff4262dd30f0 # v1.27.0
with:
payload: |
{
@@ -574,6 +578,7 @@ jobs:
}
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
check_mergeability:
if: always()
@@ -596,6 +601,6 @@ jobs:
steps:
- name: Decide if change is okay to merge
if: github.event_name != 'push'
uses: re-actors/alls-green@release/v1
uses: re-actors/alls-green@05ac9388f0aebcb5727afa17fcccfecd6f8ec5fe # v1.2.2
with:
jobs: ${{ toJSON(needs) }}

View File

@@ -21,21 +21,22 @@ jobs:
steps:
- name: Check out code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Run update-flakes
run: ./update-flake.sh
- name: Get access token
uses: tibdex/github-app-token@b62528385c34dbc9f38e5f4225ac829252d1ea92 # v1.8.0
uses: tibdex/github-app-token@3beb63f4bd073e61482598c45c71c1019b59b73a # v2.1.0
id: generate-token
with:
app_id: ${{ secrets.LICENSING_APP_ID }}
installation_id: ${{ secrets.LICENSING_APP_INSTALLATION_ID }}
installation_retrieval_mode: "id"
installation_retrieval_payload: ${{ secrets.LICENSING_APP_INSTALLATION_ID }}
private_key: ${{ secrets.LICENSING_APP_PRIVATE_KEY }}
- name: Send pull request
uses: peter-evans/create-pull-request@284f54f989303d2699d373481a0cfa13ad5a6666 #v5.0.1
uses: peter-evans/create-pull-request@5e914681df9dc83aa4e4905692ca88beb2f9e91f #v7.0.5
with:
token: ${{ steps.generate-token.outputs.token }}
author: Flakes Updater <noreply+flakes-updater@tailscale.com>

View File

@@ -14,7 +14,7 @@ jobs:
steps:
- name: Check out code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Run go get
run: |
@@ -23,18 +23,19 @@ jobs:
./tool/go mod tidy
- name: Get access token
uses: tibdex/github-app-token@b62528385c34dbc9f38e5f4225ac829252d1ea92 # v1.8.0
uses: tibdex/github-app-token@3beb63f4bd073e61482598c45c71c1019b59b73a # v2.1.0
id: generate-token
with:
# TODO(will): this should use the code updater app rather than licensing.
# It has the same permissions, so not a big deal, but still.
app_id: ${{ secrets.LICENSING_APP_ID }}
installation_id: ${{ secrets.LICENSING_APP_INSTALLATION_ID }}
installation_retrieval_mode: "id"
installation_retrieval_payload: ${{ secrets.LICENSING_APP_INSTALLATION_ID }}
private_key: ${{ secrets.LICENSING_APP_PRIVATE_KEY }}
- name: Send pull request
id: pull-request
uses: peter-evans/create-pull-request@284f54f989303d2699d373481a0cfa13ad5a6666 #v5.0.1
uses: peter-evans/create-pull-request@5e914681df9dc83aa4e4905692ca88beb2f9e91f #v7.0.5
with:
token: ${{ steps.generate-token.outputs.token }}
author: OSS Updater <noreply+oss-updater@tailscale.com>

View File

@@ -24,7 +24,7 @@ jobs:
steps:
- name: Check out code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Install deps
run: ./tool/yarn --cwd client/web
- name: Run lint

6
.gitignore vendored
View File

@@ -43,3 +43,9 @@ client/web/build/assets
/gocross
/dist
# Ignore xcode userstate and workspace data
*.xcuserstate
*.xcworkspacedata
/tstest/tailmac/bin
/tstest/tailmac/build

View File

@@ -1,17 +1,13 @@
# Copyright (c) Tailscale Inc & AUTHORS
# SPDX-License-Identifier: BSD-3-Clause
############################################################################
# Note that this Dockerfile is currently NOT used to build any of the published
# Tailscale container images and may have drifted from the image build mechanism
# we use.
# Tailscale images are currently built using https://github.com/tailscale/mkctr,
# and the build script can be found in ./build_docker.sh.
#
# WARNING: Tailscale is not yet officially supported in container
# environments, such as Docker and Kubernetes. Though it should work, we
# don't regularly test it, and we know there are some feature limitations.
#
# See current bugs tagged "containers":
# https://github.com/tailscale/tailscale/labels/containers
#
############################################################################
# This Dockerfile includes all the tailscale binaries.
#
# To build the Dockerfile:
@@ -31,7 +27,7 @@
# $ docker exec tailscaled tailscale status
FROM golang:1.22-alpine AS build-env
FROM golang:1.23-alpine AS build-env
WORKDIR /go/src/tailscale
@@ -46,7 +42,7 @@ RUN go install \
gvisor.dev/gvisor/pkg/tcpip/stack \
golang.org/x/crypto/ssh \
golang.org/x/crypto/acme \
nhooyr.io/websocket \
github.com/coder/websocket \
github.com/mdlayher/netlink
COPY . .

View File

@@ -21,6 +21,7 @@ updatedeps: ## Update depaware deps
tailscale.com/cmd/tailscaled \
tailscale.com/cmd/tailscale \
tailscale.com/cmd/derper \
tailscale.com/cmd/k8s-operator \
tailscale.com/cmd/stund
depaware: ## Run depaware checks
@@ -30,6 +31,7 @@ depaware: ## Run depaware checks
tailscale.com/cmd/tailscaled \
tailscale.com/cmd/tailscale \
tailscale.com/cmd/derper \
tailscale.com/cmd/k8s-operator \
tailscale.com/cmd/stund
buildwindows: ## Build tailscale CLI for windows/amd64
@@ -115,7 +117,8 @@ sshintegrationtest: ## Run the SSH integration tests in various Docker container
echo "Testing on ubuntu:focal" && docker build --build-arg="BASE=ubuntu:focal" -t ssh-ubuntu-focal ssh/tailssh/testcontainers && \
echo "Testing on ubuntu:jammy" && docker build --build-arg="BASE=ubuntu:jammy" -t ssh-ubuntu-jammy ssh/tailssh/testcontainers && \
echo "Testing on ubuntu:mantic" && docker build --build-arg="BASE=ubuntu:mantic" -t ssh-ubuntu-mantic ssh/tailssh/testcontainers && \
echo "Testing on ubuntu:noble" && docker build --build-arg="BASE=ubuntu:noble" -t ssh-ubuntu-noble ssh/tailssh/testcontainers
echo "Testing on ubuntu:noble" && docker build --build-arg="BASE=ubuntu:noble" -t ssh-ubuntu-noble ssh/tailssh/testcontainers && \
echo "Testing on alpine:latest" && docker build --build-arg="BASE=alpine:latest" -t ssh-alpine-latest ssh/tailssh/testcontainers
help: ## Show this help
@echo "\nSpecify a command. The choices are:\n"

View File

@@ -37,7 +37,7 @@ not open source.
## Building
We always require the latest Go release, currently Go 1.22. (While we build
We always require the latest Go release, currently Go 1.23. (While we build
releases with our [Go fork](https://github.com/tailscale/go/), its use is not
required.)

View File

@@ -1 +1 @@
1.69.0
1.77.0

103
api.md
View File

@@ -1,101 +1,2 @@
# Tailscale API
The Tailscale API documentation is located in **[tailscale/publicapi](./publicapi/readme.md#tailscale-api)**.
# APIs
**[Overview](./publicapi/readme.md)**
**[Device](./publicapi/device.md#device)**
<a href="device-delete"></a>
<a href="expire-device-key"></a>
<a href="device-routes-get">
<a href="device-routes-post"></a>
<a href="#device-authorized-post"></a>
<a href="device-tags-post"></a>
<a href="device-key-post"></a>
<a href="tailnet-acl-get"></a>
- Get a device: [`GET /api/v2/device/{deviceid}`](./publicapi/device.md#get-device)
- Delete a device: [`DELETE /api/v2/device/{deviceID}`](./publicapi/device.md#delete-device)
- Expire device key: [`POST /api/v2/device/{deviceID}/expire`](./publicapi/device.md#expire-device-key)
- [**Routes**](./publicapi/device.md#routes)
- Get device routes: [`GET /api/v2/device/{deviceID}/routes`](./publicapi/device.md#get-device-routes)
- Set device routes: [`POST /api/v2/device/{deviceID}/routes`](./publicapi/device.md#set-device-routes)
- [**Authorize**](./publicapi/device.md#authorize)
- Authorize a device: [`POST /api/v2/device/{deviceID}/authorized`](./publicapi/device.md#authorize-device)
- [**Tags**](./publicapi/device.md#tags)
- Update tags: [`POST /api/v2/device/{deviceID}/tags`](./publicapi/device.md#update-device-tags)
- [**Keys**](./publicapi/device.md#keys)
- Update device key: [`POST /api/v2/device/{deviceID}/key`](./publicapi/device.md#update-device-key)
- [**IP Addresses**](./publicapi/device.md#ip-addresses)
- Set device IPv4 address: [`POST /api/v2/device/{deviceID}/ip`](./publicapi/device.md#set-device-ipv4-address)
- [**Device posture attributes**](./publicapi/device.md#device-posture-attributes)
- Get device posture attributes: [`GET /api/v2/device/{deviceID}/attributes`](./publicapi/device.md#get-device-posture-attributes)
- Set custom device posture attributes: [`POST /api/v2/device/{deviceID}/attributes/{attributeKey}`](./publicapi/device.md#set-device-posture-attributes)
- Delete custom device posture attributes: [`DELETE /api/v2/device/{deviceID}/attributes/{attributeKey}`](./publicapi/device.md#delete-custom-device-posture-attributes)
- [**Device invites**](./publicapi/device.md#invites-to-a-device)
- List device invites: [`GET /api/v2/device/{deviceID}/device-invites`](./publicapi/device.md#list-device-invites)
- Create device invites: [`POST /api/v2/device/{deviceID}/device-invites`](./publicapi/device.md#create-device-invites)
**[Tailnet](./publicapi/tailnet.md#tailnet)**
<a href="tailnet-acl-post"></a>
<a href="tailnet-acl-preview-post"></a>
<a href="tailnet-acl-validate-post"></a>
<a href="tailnet-devices"></a>
<a href="tailnet-keys-get"></a>
<a href="tailnet-keys-post"></a>
<a href="tailnet-keys-key-get"></a>
<a href="tailnet-keys-key-delete"></a>
<a href="tailnet-dns"></a>
<a href="tailnet-dns-nameservers-get"></a>
<a href="tailnet-dns-nameservers-post"></a>
<a href="tailnet-dns-preferences-get"></a>
<a href="tailnet-dns-preferences-post"></a>
<a href="tailnet-dns-searchpaths-get"></a>
<a href="tailnet-dns-searchpaths-post"></a>
- [**Policy File**](./publicapi/tailnet.md#policy-file)
- Get policy file: [`GET /api/v2/tailnet/{tailnet}/acl`](./publicapi/tailnet.md#get-policy-file)
- Update policy file: [`POST /api/v2/tailnet/{tailnet}/acl`](./publicapi/tailnet.md#update-policy-file)
- Preview rule matches: [`POST /api/v2/tailnet/{tailnet}/acl/preview`](./publicapi/tailnet.md#preview-policy-file-rule-matches)
- Validate and test policy file: [`POST /api/v2/tailnet/{tailnet}/acl/validate`](./publicapi/tailnet.md#validate-and-test-policy-file)
- [**Devices**](./publicapi/tailnet.md#devices)
- List tailnet devices: [`GET /api/v2/tailnet/{tailnet}/devices`](./publicapi/tailnet.md#list-tailnet-devices)
- [**Keys**](./publicapi/tailnet.md#tailnet-keys)
- List tailnet keys: [`GET /api/v2/tailnet/{tailnet}/keys`](./publicapi/tailnet.md#list-tailnet-keys)
- Create an auth key: [`POST /api/v2/tailnet/{tailnet}/keys`](./publicapi/tailnet.md#create-auth-key)
- Get a key: [`GET /api/v2/tailnet/{tailnet}/keys/{keyid}`](./publicapi/tailnet.md#get-key)
- Delete a key: [`DELETE /api/v2/tailnet/{tailnet}/keys/{keyid}`](./publicapi/tailnet.md#delete-key)
- [**DNS**](./publicapi/tailnet.md#dns)
- [**Nameservers**](./publicapi/tailnet.md#nameservers)
- Get nameservers: [`GET /api/v2/tailnet/{tailnet}/dns/nameservers`](./publicapi/tailnet.md#get-nameservers)
- Set nameservers: [`POST /api/v2/tailnet/{tailnet}/dns/nameservers`](./publicapi/tailnet.md#set-nameservers)
- [**Preferences**](./publicapi/tailnet.md#preferences)
- Get DNS preferences: [`GET /api/v2/tailnet/{tailnet}/dns/preferences`](./publicapi/tailnet.md#get-dns-preferences)
- Set DNS preferences: [`POST /api/v2/tailnet/{tailnet}/dns/preferences`](./publicapi/tailnet.md#set-dns-preferences)
- [**Search Paths**](./publicapi/tailnet.md#search-paths)
- Get search paths: [`GET /api/v2/tailnet/{tailnet}/dns/searchpaths`](./publicapi/tailnet.md#get-search-paths)
- Set search paths: [`POST /api/v2/tailnet/{tailnet}/dns/searchpaths`](./publicapi/tailnet.md#set-search-paths)
- [**Split DNS**](./publicapi/tailnet.md#split-dns)
- Get split DNS: [`GET /api/v2/tailnet/{tailnet}/dns/split-dns`](./publicapi/tailnet.md#get-split-dns)
- Update split DNS: [`PATCH /api/v2/tailnet/{tailnet}/dns/split-dns`](./publicapi/tailnet.md#update-split-dns)
- Set split DNS: [`PUT /api/v2/tailnet/{tailnet}/dns/split-dns`](./publicapi/tailnet.md#set-split-dns)
- [**User invites**](./publicapi/tailnet.md#tailnet-user-invites)
- List user invites: [`GET /api/v2/tailnet/{tailnet}/user-invites`](./publicapi/tailnet.md#list-user-invites)
- Create user invites: [`POST /api/v2/tailnet/{tailnet}/user-invites`](./publicapi/tailnet.md#create-user-invites)
**[User invites](./publicapi/userinvites.md#user-invites)**
- Get user invite: [`GET /api/v2/user-invites/{userInviteId}`](./publicapi/userinvites.md#get-user-invite)
- Delete user invite: [`DELETE /api/v2/user-invites/{userInviteId}`](./publicapi/userinvites.md#delete-user-invite)
- Resend user invite (by email): [`POST /api/v2/user-invites/{userInviteId}/resend`](#resend-user-invite)
**[Device invites](./publicapi/deviceinvites.md#device-invites)**
- Get device invite: [`GET /api/v2/device-invites/{deviceInviteId}`](./publicapi/deviceinvites.md#get-device-invite)
- Delete device invite: [`DELETE /api/v2/device-invites/{deviceInviteId}`](./publicapi/deviceinvites.md#delete-device-invite)
- Resend device invite (by email): [`POST /api/v2/device-invites/{deviceInviteId}/resend`](./publicapi/deviceinvites.md#resend-device-invite)
- Accept device invite [`POST /api/v2/device-invites/-/accept`](#accept-device-invite)
> [!IMPORTANT]
> The Tailscale API documentation has moved to https://tailscale.com/api

View File

@@ -11,6 +11,7 @@ package appc
import (
"context"
"fmt"
"net/netip"
"slices"
"strings"
@@ -21,6 +22,7 @@ import (
"golang.org/x/net/dns/dnsmessage"
"tailscale.com/types/logger"
"tailscale.com/types/views"
"tailscale.com/util/clientmetric"
"tailscale.com/util/dnsname"
"tailscale.com/util/execqueue"
"tailscale.com/util/mak"
@@ -78,6 +80,42 @@ type RouteAdvertiser interface {
UnadvertiseRoute(...netip.Prefix) error
}
var (
metricStoreRoutesRateBuckets = []int64{1, 2, 3, 4, 5, 10, 100, 1000}
metricStoreRoutesNBuckets = []int64{1, 2, 3, 4, 5, 10, 100, 1000, 10000}
metricStoreRoutesRate []*clientmetric.Metric
metricStoreRoutesN []*clientmetric.Metric
)
func initMetricStoreRoutes() {
for _, n := range metricStoreRoutesRateBuckets {
metricStoreRoutesRate = append(metricStoreRoutesRate, clientmetric.NewCounter(fmt.Sprintf("appc_store_routes_rate_%d", n)))
}
metricStoreRoutesRate = append(metricStoreRoutesRate, clientmetric.NewCounter("appc_store_routes_rate_over"))
for _, n := range metricStoreRoutesNBuckets {
metricStoreRoutesN = append(metricStoreRoutesN, clientmetric.NewCounter(fmt.Sprintf("appc_store_routes_n_routes_%d", n)))
}
metricStoreRoutesN = append(metricStoreRoutesN, clientmetric.NewCounter("appc_store_routes_n_routes_over"))
}
func recordMetric(val int64, buckets []int64, metrics []*clientmetric.Metric) {
if len(buckets) < 1 {
return
}
// finds the first bucket where val <=, or len(buckets) if none match
// for bucket values of 1, 10, 100; 0-1 goes to [0], 2-10 goes to [1], 11-100 goes to [2], 101+ goes to [3]
bucket, _ := slices.BinarySearch(buckets, val)
metrics[bucket].Add(1)
}
func metricStoreRoutes(rate, nRoutes int64) {
if len(metricStoreRoutesRate) == 0 {
initMetricStoreRoutes()
}
recordMetric(rate, metricStoreRoutesRateBuckets, metricStoreRoutesRate)
recordMetric(nRoutes, metricStoreRoutesNBuckets, metricStoreRoutesN)
}
// RouteInfo is a data structure used to persist the in memory state of an AppConnector
// so that we can know, even after a restart, which routes came from ACLs and which were
// learned from domains.
@@ -141,6 +179,7 @@ func NewAppConnector(logf logger.Logf, routeAdvertiser RouteAdvertiser, routeInf
}
ac.writeRateMinute = newRateLogger(time.Now, time.Minute, func(c int64, s time.Time, l int64) {
ac.logf("routeInfo write rate: %d in minute starting at %v (%d routes)", c, s, l)
metricStoreRoutes(c, l)
})
ac.writeRateDay = newRateLogger(time.Now, 24*time.Hour, func(c int64, s time.Time, l int64) {
ac.logf("routeInfo write rate: %d in 24 hours starting at %v (%d routes)", c, s, l)
@@ -442,8 +481,10 @@ func (e *AppConnector) ObserveDNSResponse(res []byte) {
}
}
e.logf("[v2] observed new routes for %s: %s", domain, toAdvertise)
e.scheduleAdvertisement(domain, toAdvertise...)
if len(toAdvertise) > 0 {
e.logf("[v2] observed new routes for %s: %s", domain, toAdvertise)
e.scheduleAdvertisement(domain, toAdvertise...)
}
}
}

View File

@@ -15,6 +15,7 @@ import (
"golang.org/x/net/dns/dnsmessage"
"tailscale.com/appc/appctest"
"tailscale.com/tstest"
"tailscale.com/util/clientmetric"
"tailscale.com/util/mak"
"tailscale.com/util/must"
)
@@ -569,3 +570,35 @@ func TestRateLogger(t *testing.T) {
t.Fatalf("wasCalled: got false, want true")
}
}
func TestRouteStoreMetrics(t *testing.T) {
metricStoreRoutes(1, 1)
metricStoreRoutes(1, 1) // the 1 buckets value should be 2
metricStoreRoutes(5, 5) // the 5 buckets value should be 1
metricStoreRoutes(6, 6) // the 10 buckets value should be 1
metricStoreRoutes(10001, 10001) // the over buckets value should be 1
wanted := map[string]int64{
"appc_store_routes_n_routes_1": 2,
"appc_store_routes_rate_1": 2,
"appc_store_routes_n_routes_5": 1,
"appc_store_routes_rate_5": 1,
"appc_store_routes_n_routes_10": 1,
"appc_store_routes_rate_10": 1,
"appc_store_routes_n_routes_over": 1,
"appc_store_routes_rate_over": 1,
}
for _, x := range clientmetric.Metrics() {
if x.Value() != wanted[x.Name()] {
t.Errorf("%s: want: %d, got: %d", x.Name(), wanted[x.Name()], x.Value())
}
}
}
func TestMetricBucketsAreSorted(t *testing.T) {
if !slices.IsSorted(metricStoreRoutesRateBuckets) {
t.Errorf("metricStoreRoutesRateBuckets must be in order")
}
if !slices.IsSorted(metricStoreRoutesNBuckets) {
t.Errorf("metricStoreRoutesNBuckets must be in order")
}
}

View File

@@ -1,6 +1,7 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
// Package appctest contains code to help test App Connectors.
package appctest
import (

View File

@@ -0,0 +1,27 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build tailscale_go
package tailscaleroot
import (
"fmt"
"os"
"strings"
)
func init() {
tsRev, ok := tailscaleToolchainRev()
if !ok {
panic("binary built with tailscale_go build tag but failed to read build info or find tailscale.toolchain.rev in build info")
}
want := strings.TrimSpace(GoToolchainRev)
if tsRev != want {
if os.Getenv("TS_PERMIT_TOOLCHAIN_MISMATCH") == "1" {
fmt.Fprintf(os.Stderr, "tailscale.toolchain.rev = %q, want %q; but ignoring due to TS_PERMIT_TOOLCHAIN_MISMATCH=1\n", tsRev, want)
return
}
panic(fmt.Sprintf("binary built with tailscale_go build tag but Go toolchain %q doesn't match github.com/tailscale/tailscale expected value %q; override this failure with TS_PERMIT_TOOLCHAIN_MISMATCH=1", tsRev, want))
}
}

View File

@@ -1,21 +1,11 @@
#!/usr/bin/env sh
#
# Runs `go build` with flags configured for docker distribution. All
# it does differently from `go build` is burn git commit and version
# information into the binaries inside docker, so that we can track down user
# issues.
#
############################################################################
#
# WARNING: Tailscale is not yet officially supported in container
# environments, such as Docker and Kubernetes. Though it should work, we
# don't regularly test it, and we know there are some feature limitations.
#
# See current bugs tagged "containers":
# https://github.com/tailscale/tailscale/labels/containers
#
############################################################################
# This script builds Tailscale container images using
# github.com/tailscale/mkctr.
# By default the images will be tagged with the current version and git
# hash of this repository as produced by ./cmd/mkversion.
# This is the image build mechanim used to build the official Tailscale
# container images.
set -eu
@@ -27,12 +17,20 @@ eval "$(./build_dist.sh shellvars)"
DEFAULT_TARGET="client"
DEFAULT_TAGS="v${VERSION_SHORT},v${VERSION_MINOR}"
DEFAULT_BASE="tailscale/alpine-base:3.18"
# Set a few pre-defined OCI annotations. The source annotation is used by tools such as Renovate that scan the linked
# Github repo to find release notes for any new image tags. Note that for official Tailscale images the default
# annotations defined here will be overriden by release scripts that call this script.
# https://github.com/opencontainers/image-spec/blob/main/annotations.md#pre-defined-annotation-keys
DEFAULT_ANNOTATIONS="org.opencontainers.image.source=https://github.com/tailscale/tailscale/blob/main/build_docker.sh,org.opencontainers.image.vendor=Tailscale"
PUSH="${PUSH:-false}"
TARGET="${TARGET:-${DEFAULT_TARGET}}"
TAGS="${TAGS:-${DEFAULT_TAGS}}"
BASE="${BASE:-${DEFAULT_BASE}}"
PLATFORM="${PLATFORM:-}" # default to all platforms
# OCI annotations that will be added to the image.
# https://github.com/opencontainers/image-spec/blob/main/annotations.md
ANNOTATIONS="${ANNOTATIONS:-${DEFAULT_ANNOTATIONS}}"
case "$TARGET" in
client)
@@ -49,10 +47,11 @@ case "$TARGET" in
-X tailscale.com/version.gitCommitStamp=${VERSION_GIT_HASH}" \
--base="${BASE}" \
--tags="${TAGS}" \
--gotags="ts_kube" \
--gotags="ts_kube,ts_package_container" \
--repos="${REPOS}" \
--push="${PUSH}" \
--target="${PLATFORM}" \
--annotations="${ANNOTATIONS}" \
/usr/local/bin/containerboot
;;
operator)
@@ -66,9 +65,11 @@ case "$TARGET" in
-X tailscale.com/version.gitCommitStamp=${VERSION_GIT_HASH}" \
--base="${BASE}" \
--tags="${TAGS}" \
--gotags="ts_kube,ts_package_container" \
--repos="${REPOS}" \
--push="${PUSH}" \
--target="${PLATFORM}" \
--annotations="${ANNOTATIONS}" \
/usr/local/bin/operator
;;
k8s-nameserver)
@@ -82,9 +83,11 @@ case "$TARGET" in
-X tailscale.com/version.gitCommitStamp=${VERSION_GIT_HASH}" \
--base="${BASE}" \
--tags="${TAGS}" \
--gotags="ts_kube,ts_package_container" \
--repos="${REPOS}" \
--push="${PUSH}" \
--target="${PLATFORM}" \
--annotations="${ANNOTATIONS}" \
/usr/local/bin/k8s-nameserver
;;
*)

View File

@@ -19,6 +19,7 @@ import (
// Only one of Src/Dst or Users/Ports may be specified.
type ACLRow struct {
Action string `json:"action,omitempty"` // valid values: "accept"
Proto string `json:"proto,omitempty"` // protocol
Users []string `json:"users,omitempty"` // old name for src
Ports []string `json:"ports,omitempty"` // old name for dst
Src []string `json:"src,omitempty"`
@@ -31,12 +32,23 @@ type ACLRow struct {
type ACLTest struct {
Src string `json:"src,omitempty"` // source
User string `json:"user,omitempty"` // old name for source
Proto string `json:"proto,omitempty"` // protocol
Accept []string `json:"accept,omitempty"` // expected destination ip:port that user can access
Deny []string `json:"deny,omitempty"` // expected destination ip:port that user cannot access
Allow []string `json:"allow,omitempty"` // old name for accept
}
// NodeAttrGrant defines additional string attributes that apply to specific devices.
type NodeAttrGrant struct {
// Target specifies which nodes the attributes apply to. The nodes can be a
// tag (tag:server), user (alice@example.com), group (group:kids), or *.
Target []string `json:"target,omitempty"`
// Attr are the attributes to set on Target(s).
Attr []string `json:"attr,omitempty"`
}
// ACLDetails contains all the details for an ACL.
type ACLDetails struct {
Tests []ACLTest `json:"tests,omitempty"`
@@ -44,6 +56,7 @@ type ACLDetails struct {
Groups map[string][]string `json:"groups,omitempty"`
TagOwners map[string][]string `json:"tagowners,omitempty"`
Hosts map[string]string `json:"hosts,omitempty"`
NodeAttrs []NodeAttrGrant `json:"nodeAttrs,omitempty"`
}
// ACL contains an ACLDetails and metadata.
@@ -150,7 +163,12 @@ func (c *Client) ACLHuJSON(ctx context.Context) (acl *ACLHuJSON, err error) {
// ACLTestFailureSummary specifies the JSON format sent to the
// JavaScript client to be rendered in the HTML.
type ACLTestFailureSummary struct {
User string `json:"user,omitempty"`
// User is the source ("src") value of the ACL test that failed.
// The name "user" is a legacy holdover from the original naming and
// is kept for compatibility but it may also contain any value
// that's valid in a ACL test "src" field.
User string `json:"user,omitempty"`
Errors []string `json:"errors,omitempty"`
Warnings []string `json:"warnings,omitempty"`
}
@@ -270,6 +288,9 @@ type UserRuleMatch struct {
Users []string `json:"users"`
Ports []string `json:"ports"`
LineNumber int `json:"lineNumber"`
// Via is the list of targets through which Users can access Ports.
// See https://tailscale.com/kb/1378/via for more information.
Via []string `json:"via,omitempty"`
// Postures is a list of posture policies that are
// associated with this match. The rules can be looked

View File

@@ -4,7 +4,10 @@
// Package apitype contains types for the Tailscale LocalAPI and control plane API.
package apitype
import "tailscale.com/tailcfg"
import (
"tailscale.com/tailcfg"
"tailscale.com/types/dnstype"
)
// LocalAPIHost is the Host header value used by the LocalAPI.
const LocalAPIHost = "local-tailscaled.sock"
@@ -57,3 +60,19 @@ type ExitNodeSuggestionResponse struct {
Name string
Location tailcfg.LocationView `json:",omitempty"`
}
// DNSOSConfig mimics dns.OSConfig without forcing us to import the entire dns package
// into the CLI.
type DNSOSConfig struct {
Nameservers []string
SearchDomains []string
MatchDomains []string
}
// DNSQueryResponse is the response to a DNS query request sent via LocalAPI.
type DNSQueryResponse struct {
// Bytes is the raw DNS response bytes.
Bytes []byte
// Resolvers is the list of resolvers that the forwarder deemed able to resolve the query.
Resolvers []*dnstype.Resolver
}

View File

@@ -37,8 +37,10 @@ import (
"tailscale.com/safesocket"
"tailscale.com/tailcfg"
"tailscale.com/tka"
"tailscale.com/types/dnstype"
"tailscale.com/types/key"
"tailscale.com/types/tkatype"
"tailscale.com/util/syspolicy/setting"
)
// defaultLocalClient is the default LocalClient when using the legacy
@@ -69,6 +71,14 @@ type LocalClient struct {
// connecting to the GUI client variants.
UseSocketOnly bool
// OmitAuth, if true, omits sending the local Tailscale daemon any
// authentication token that might be required by the platform.
//
// As of 2024-08-12, only macOS uses an authentication token. OmitAuth is
// meant for when Dial is set and the LocalAPI is being proxied to a
// different operating system, such as in integration tests.
OmitAuth bool
// tsClient does HTTP requests to the local Tailscale daemon.
// It's lazily initialized on first use.
tsClient *http.Client
@@ -103,7 +113,7 @@ func (lc *LocalClient) defaultDialer(ctx context.Context, network, addr string)
return d.DialContext(ctx, "tcp", "127.0.0.1:"+strconv.Itoa(port))
}
}
return safesocket.Connect(lc.socket())
return safesocket.ConnectContext(ctx, lc.socket())
}
// DoLocalRequest makes an HTTP request to the local machine's Tailscale daemon.
@@ -124,8 +134,10 @@ func (lc *LocalClient) DoLocalRequest(req *http.Request) (*http.Response, error)
},
}
})
if _, token, err := safesocket.LocalTCPPortAndToken(); err == nil {
req.SetBasicAuth("", token)
if !lc.OmitAuth {
if _, token, err := safesocket.LocalTCPPortAndToken(); err == nil {
req.SetBasicAuth("", token)
}
}
return lc.tsClient.Do(req)
}
@@ -343,6 +355,12 @@ func (lc *LocalClient) DaemonMetrics(ctx context.Context) ([]byte, error) {
return lc.get200(ctx, "/localapi/v0/metrics")
}
// UserMetrics returns the user metrics in
// the Prometheus text exposition format.
func (lc *LocalClient) UserMetrics(ctx context.Context) ([]byte, error) {
return lc.get200(ctx, "/localapi/v0/usermetrics")
}
// IncrementCounter increments the value of a Tailscale daemon's counter
// metric by the given delta. If the metric has yet to exist, a new counter
// metric is created and initialized to delta.
@@ -797,6 +815,62 @@ func (lc *LocalClient) EditPrefs(ctx context.Context, mp *ipn.MaskedPrefs) (*ipn
return decodeJSON[*ipn.Prefs](body)
}
// GetEffectivePolicy returns the effective policy for the specified scope.
func (lc *LocalClient) GetEffectivePolicy(ctx context.Context, scope setting.PolicyScope) (*setting.Snapshot, error) {
scopeID, err := scope.MarshalText()
if err != nil {
return nil, err
}
body, err := lc.get200(ctx, "/localapi/v0/policy/"+string(scopeID))
if err != nil {
return nil, err
}
return decodeJSON[*setting.Snapshot](body)
}
// ReloadEffectivePolicy reloads the effective policy for the specified scope
// by reading and merging policy settings from all applicable policy sources.
func (lc *LocalClient) ReloadEffectivePolicy(ctx context.Context, scope setting.PolicyScope) (*setting.Snapshot, error) {
scopeID, err := scope.MarshalText()
if err != nil {
return nil, err
}
body, err := lc.send(ctx, "POST", "/localapi/v0/policy/"+string(scopeID), 200, http.NoBody)
if err != nil {
return nil, err
}
return decodeJSON[*setting.Snapshot](body)
}
// GetDNSOSConfig returns the system DNS configuration for the current device.
// That is, it returns the DNS configuration that the system would use if Tailscale weren't being used.
func (lc *LocalClient) GetDNSOSConfig(ctx context.Context) (*apitype.DNSOSConfig, error) {
body, err := lc.get200(ctx, "/localapi/v0/dns-osconfig")
if err != nil {
return nil, err
}
var osCfg apitype.DNSOSConfig
if err := json.Unmarshal(body, &osCfg); err != nil {
return nil, fmt.Errorf("invalid dns.OSConfig: %w", err)
}
return &osCfg, nil
}
// QueryDNS executes a DNS query for a name (`google.com.`) and query type (`CNAME`).
// It returns the raw DNS response bytes and the resolvers that were used to answer the query
// (often just one, but can be more if we raced multiple resolvers).
func (lc *LocalClient) QueryDNS(ctx context.Context, name string, queryType string) (bytes []byte, resolvers []*dnstype.Resolver, err error) {
body, err := lc.get200(ctx, fmt.Sprintf("/localapi/v0/dns-query?name=%s&type=%s", url.QueryEscape(name), queryType))
if err != nil {
return nil, nil, err
}
var res apitype.DNSQueryResponse
if err := json.Unmarshal(body, &res); err != nil {
return nil, nil, fmt.Errorf("invalid query response: %w", err)
}
return res.Bytes, res.Resolvers, nil
}
// StartLoginInteractive starts an interactive login.
func (lc *LocalClient) StartLoginInteractive(ctx context.Context) error {
_, err := lc.send(ctx, "POST", "/localapi/v0/login-interactive", http.StatusNoContent, nil)
@@ -933,7 +1007,20 @@ func CertPair(ctx context.Context, domain string) (certPEM, keyPEM []byte, err e
//
// API maturity: this is considered a stable API.
func (lc *LocalClient) CertPair(ctx context.Context, domain string) (certPEM, keyPEM []byte, err error) {
res, err := lc.send(ctx, "GET", "/localapi/v0/cert/"+domain+"?type=pair", 200, nil)
return lc.CertPairWithValidity(ctx, domain, 0)
}
// CertPairWithValidity returns a cert and private key for the provided DNS
// domain.
//
// It returns a cached certificate from disk if it's still valid.
// When minValidity is non-zero, the returned certificate will be valid for at
// least the given duration, if permitted by the CA. If the certificate is
// valid, but for less than minValidity, it will be synchronously renewed.
//
// API maturity: this is considered a stable API.
func (lc *LocalClient) CertPairWithValidity(ctx context.Context, domain string, minValidity time.Duration) (certPEM, keyPEM []byte, err error) {
res, err := lc.send(ctx, "GET", fmt.Sprintf("/localapi/v0/cert/%s?type=pair&min_validity=%s", domain, minValidity), 200, nil)
if err != nil {
return nil, nil, err
}
@@ -1240,6 +1327,17 @@ func (lc *LocalClient) SetServeConfig(ctx context.Context, config *ipn.ServeConf
return nil
}
// DisconnectControl shuts down all connections to control, thus making control consider this node inactive. This can be
// run on HA subnet router or app connector replicas before shutting them down to ensure peers get told to switch over
// to another replica whilst there is still some grace period for the existing connections to terminate.
func (lc *LocalClient) DisconnectControl(ctx context.Context) error {
_, _, err := lc.sendWithHeaders(ctx, "POST", "/localapi/v0/disconnect-control", 200, nil, nil)
if err != nil {
return fmt.Errorf("error disconnecting control: %w", err)
}
return nil
}
// NetworkLockDisable shuts down network-lock across the tailnet.
func (lc *LocalClient) NetworkLockDisable(ctx context.Context, secret []byte) error {
if _, err := lc.send(ctx, "POST", "/localapi/v0/tka/disable", 200, bytes.NewReader(secret)); err != nil {

View File

@@ -1,10 +1,10 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !go1.21
//go:build !go1.23
package tailscale
func init() {
you_need_Go_1_21_to_compile_Tailscale()
you_need_Go_1_23_to_compile_Tailscale()
}

View File

@@ -51,6 +51,9 @@ type Client struct {
// HTTPClient optionally specifies an alternate HTTP client to use.
// If nil, http.DefaultClient is used.
HTTPClient *http.Client
// UserAgent optionally specifies an alternate User-Agent header
UserAgent string
}
func (c *Client) httpClient() *http.Client {
@@ -97,8 +100,9 @@ func (c *Client) setAuth(r *http.Request) {
// and can be changed manually by the user.
func NewClient(tailnet string, auth AuthMethod) *Client {
return &Client{
tailnet: tailnet,
auth: auth,
tailnet: tailnet,
auth: auth,
UserAgent: "tailscale-client-oss",
}
}
@@ -110,17 +114,16 @@ func (c *Client) Do(req *http.Request) (*http.Response, error) {
return nil, errors.New("use of Client without setting I_Acknowledge_This_API_Is_Unstable")
}
c.setAuth(req)
if c.UserAgent != "" {
req.Header.Set("User-Agent", c.UserAgent)
}
return c.httpClient().Do(req)
}
// sendRequest add the authentication key to the request and sends it. It
// receives the response and reads up to 10MB of it.
func (c *Client) sendRequest(req *http.Request) ([]byte, *http.Response, error) {
if !I_Acknowledge_This_API_Is_Unstable {
return nil, nil, errors.New("use of Client without setting I_Acknowledge_This_API_Is_Unstable")
}
c.setAuth(req)
resp, err := c.httpClient().Do(req)
resp, err := c.Do(req)
if err != nil {
return nil, resp, err
}

View File

@@ -3,7 +3,7 @@
"version": "0.0.1",
"license": "BSD-3-Clause",
"engines": {
"node": "18.16.1",
"node": "18.20.4",
"yarn": "1.22.19"
},
"type": "module",

View File

@@ -17,7 +17,6 @@ import (
"os"
"path"
"path/filepath"
"slices"
"strings"
"sync"
"time"
@@ -27,6 +26,7 @@ import (
"tailscale.com/client/tailscale/apitype"
"tailscale.com/clientupdate"
"tailscale.com/envknob"
"tailscale.com/envknob/featureknob"
"tailscale.com/hostinfo"
"tailscale.com/ipn"
"tailscale.com/ipn/ipnstate"
@@ -35,6 +35,7 @@ import (
"tailscale.com/net/tsaddr"
"tailscale.com/tailcfg"
"tailscale.com/types/logger"
"tailscale.com/types/views"
"tailscale.com/util/httpm"
"tailscale.com/version"
"tailscale.com/version/distro"
@@ -113,11 +114,6 @@ const (
ManageServerMode ServerMode = "manage"
)
var (
exitNodeRouteV4 = netip.MustParsePrefix("0.0.0.0/0")
exitNodeRouteV6 = netip.MustParsePrefix("::/0")
)
// ServerOpts contains options for constructing a new Server.
type ServerOpts struct {
// Mode specifies the mode of web client being constructed.
@@ -283,6 +279,12 @@ func (s *Server) serve(w http.ResponseWriter, r *http.Request) {
}
}
if r.URL.Path == "/metrics" {
r.URL.Path = "/api/local/v0/usermetrics"
s.proxyRequestToLocalAPI(w, r)
return
}
if strings.HasPrefix(r.URL.Path, "/api/") {
switch {
case r.URL.Path == "/api/auth" && r.Method == httpm.GET:
@@ -921,10 +923,10 @@ func (s *Server) serveGetNodeData(w http.ResponseWriter, r *http.Request) {
return p == route
})
}
data.AdvertisingExitNodeApproved = routeApproved(exitNodeRouteV4) || routeApproved(exitNodeRouteV6)
data.AdvertisingExitNodeApproved = routeApproved(tsaddr.AllIPv4()) || routeApproved(tsaddr.AllIPv6())
for _, r := range prefs.AdvertiseRoutes {
if r == exitNodeRouteV4 || r == exitNodeRouteV6 {
if tsaddr.IsExitRoute(r) {
data.AdvertisingExitNode = true
} else {
data.AdvertisedRoutes = append(data.AdvertisedRoutes, subnetRoute{
@@ -959,37 +961,16 @@ func (s *Server) serveGetNodeData(w http.ResponseWriter, r *http.Request) {
}
func availableFeatures() map[string]bool {
env := hostinfo.GetEnvType()
features := map[string]bool{
"advertise-exit-node": true, // available on all platforms
"advertise-routes": true, // available on all platforms
"use-exit-node": canUseExitNode(env) == nil,
"ssh": envknob.CanRunTailscaleSSH() == nil,
"use-exit-node": featureknob.CanUseExitNode() == nil,
"ssh": featureknob.CanRunTailscaleSSH() == nil,
"auto-update": version.IsUnstableBuild() && clientupdate.CanAutoUpdate(),
}
if env == hostinfo.HomeAssistantAddOn {
// Setting SSH on Home Assistant causes trouble on startup
// (since the flag is not being passed to `tailscale up`).
// Although Tailscale SSH does work here,
// it's not terribly useful since it's running in a separate container.
features["ssh"] = false
}
return features
}
func canUseExitNode(env hostinfo.EnvType) error {
switch dist := distro.Get(); dist {
case distro.Synology, // see https://github.com/tailscale/tailscale/issues/1995
distro.QNAP,
distro.Unraid:
return fmt.Errorf("Tailscale exit nodes cannot be used on %s.", dist)
}
if env == hostinfo.HomeAssistantAddOn {
return errors.New("Tailscale exit nodes cannot be used on Home Assistant.")
}
return nil
}
// aclsAllowAccess returns whether tailnet ACLs (as expressed in the provided filter rules)
// permit any devices to access the local web client.
// This does not currently check whether a specific device can connect, just any device.
@@ -1065,7 +1046,7 @@ func (s *Server) servePostRoutes(ctx context.Context, data postRoutesRequest) er
var currNonExitRoutes []string
var currAdvertisingExitNode bool
for _, r := range prefs.AdvertiseRoutes {
if r == exitNodeRouteV4 || r == exitNodeRouteV6 {
if tsaddr.IsExitRoute(r) {
currAdvertisingExitNode = true
continue
}
@@ -1086,12 +1067,7 @@ func (s *Server) servePostRoutes(ctx context.Context, data postRoutesRequest) er
return err
}
hasExitNodeRoute := func(all []netip.Prefix) bool {
return slices.Contains(all, exitNodeRouteV4) ||
slices.Contains(all, exitNodeRouteV6)
}
if !data.UseExitNode.IsZero() && hasExitNodeRoute(routes) {
if !data.UseExitNode.IsZero() && tsaddr.ContainsExitRoutes(views.SliceOf(routes)) {
return errors.New("cannot use and advertise exit node at same time")
}

View File

@@ -5382,9 +5382,9 @@ wrappy@1:
integrity sha512-l4Sp/DRseor9wL6EvV2+TuQn63dMkPjZ/sp9XkghTEbV9KlPS1xUsZ3u7/IQO4wxtcFB4bgpQPRcR3QCvezPcQ==
ws@^8.14.2:
version "8.14.2"
resolved "https://registry.yarnpkg.com/ws/-/ws-8.14.2.tgz#6c249a806eb2db7a20d26d51e7709eab7b2e6c7f"
integrity sha512-wEBG1ftX4jcglPxgFCMJmZ2PLtSbJ2Peg6TmpJFTbe9GZYOQCDPdMYu/Tm0/bGZkw8paZnJY45J4K2PZrLYq8g==
version "8.17.1"
resolved "https://registry.yarnpkg.com/ws/-/ws-8.17.1.tgz#9293da530bb548febc95371d90f9c878727d919b"
integrity sha512-6XQFvXTkbfUOZOKKILFG1PDK2NDQs4azKQl26T0YS5CxqWLgXajbPZ+h4gZekJyRqFU8pvnbAbbs/3TgRPy+GQ==
xml-name-validator@^5.0.0:
version "5.0.0"

View File

@@ -27,11 +27,8 @@ import (
"strconv"
"strings"
"github.com/google/uuid"
"tailscale.com/clientupdate/distsign"
"tailscale.com/types/logger"
"tailscale.com/util/cmpver"
"tailscale.com/util/winutil"
"tailscale.com/version"
"tailscale.com/version/distro"
)
@@ -248,6 +245,11 @@ func (up *Updater) getUpdateFunction() (fn updateFunction, canAutoUpdate bool) {
// CanAutoUpdate reports whether auto-updating via the clientupdate package
// is supported for the current os/distro.
func CanAutoUpdate() bool {
if version.IsMacSysExt() {
// Macsys uses Sparkle for auto-updates, which doesn't have an update
// function in this package.
return true
}
_, canAutoUpdate := (&Updater{}).getUpdateFunction()
return canAutoUpdate
}
@@ -751,164 +753,6 @@ func (up *Updater) updateMacAppStore() error {
return nil
}
const (
// winMSIEnv is the environment variable that, if set, is the MSI file for
// the update command to install. It's passed like this so we can stop the
// tailscale.exe process from running before the msiexec process runs and
// tries to overwrite ourselves.
winMSIEnv = "TS_UPDATE_WIN_MSI"
// winExePathEnv is the environment variable that is set along with
// winMSIEnv and carries the full path of the calling tailscale.exe binary.
// It is used to re-launch the GUI process (tailscale-ipn.exe) after
// install is complete.
winExePathEnv = "TS_UPDATE_WIN_EXE_PATH"
)
var (
verifyAuthenticode func(string) error // set non-nil only on Windows
markTempFileFunc func(string) error // set non-nil only on Windows
)
func (up *Updater) updateWindows() error {
if msi := os.Getenv(winMSIEnv); msi != "" {
// stdout/stderr from this part of the install could be lost since the
// parent tailscaled is replaced. Create a temp log file to have some
// output to debug with in case update fails.
close, err := up.switchOutputToFile()
if err != nil {
up.Logf("failed to create log file for installation: %v; proceeding with existing outputs", err)
} else {
defer close.Close()
}
up.Logf("installing %v ...", msi)
if err := up.installMSI(msi); err != nil {
up.Logf("MSI install failed: %v", err)
return err
}
up.Logf("success.")
return nil
}
if !winutil.IsCurrentProcessElevated() {
return errors.New(`update must be run as Administrator
you can run the command prompt as Administrator one of these ways:
* right-click cmd.exe, select 'Run as administrator'
* press Windows+x, then press a
* press Windows+r, type in "cmd", then press Ctrl+Shift+Enter`)
}
ver, err := requestedTailscaleVersion(up.Version, up.Track)
if err != nil {
return err
}
arch := runtime.GOARCH
if arch == "386" {
arch = "x86"
}
if !up.confirm(ver) {
return nil
}
tsDir := filepath.Join(os.Getenv("ProgramData"), "Tailscale")
msiDir := filepath.Join(tsDir, "MSICache")
if fi, err := os.Stat(tsDir); err != nil {
return fmt.Errorf("expected %s to exist, got stat error: %w", tsDir, err)
} else if !fi.IsDir() {
return fmt.Errorf("expected %s to be a directory; got %v", tsDir, fi.Mode())
}
if err := os.MkdirAll(msiDir, 0700); err != nil {
return err
}
up.cleanupOldDownloads(filepath.Join(msiDir, "*.msi"))
pkgsPath := fmt.Sprintf("%s/tailscale-setup-%s-%s.msi", up.Track, ver, arch)
msiTarget := filepath.Join(msiDir, path.Base(pkgsPath))
if err := up.downloadURLToFile(pkgsPath, msiTarget); err != nil {
return err
}
up.Logf("verifying MSI authenticode...")
if err := verifyAuthenticode(msiTarget); err != nil {
return fmt.Errorf("authenticode verification of %s failed: %w", msiTarget, err)
}
up.Logf("authenticode verification succeeded")
up.Logf("making tailscale.exe copy to switch to...")
up.cleanupOldDownloads(filepath.Join(os.TempDir(), "tailscale-updater-*.exe"))
selfOrig, selfCopy, err := makeSelfCopy()
if err != nil {
return err
}
defer os.Remove(selfCopy)
up.Logf("running tailscale.exe copy for final install...")
cmd := exec.Command(selfCopy, "update")
cmd.Env = append(os.Environ(), winMSIEnv+"="+msiTarget, winExePathEnv+"="+selfOrig)
cmd.Stdout = up.Stderr
cmd.Stderr = up.Stderr
cmd.Stdin = os.Stdin
if err := cmd.Start(); err != nil {
return err
}
// Once it's started, exit ourselves, so the binary is free
// to be replaced.
os.Exit(0)
panic("unreachable")
}
func (up *Updater) switchOutputToFile() (io.Closer, error) {
var logFilePath string
exePath, err := os.Executable()
if err != nil {
logFilePath = filepath.Join(os.TempDir(), "tailscale-updater.log")
} else {
logFilePath = strings.TrimSuffix(exePath, ".exe") + ".log"
}
up.Logf("writing update output to %q", logFilePath)
logFile, err := os.Create(logFilePath)
if err != nil {
return nil, err
}
up.Logf = func(m string, args ...any) {
fmt.Fprintf(logFile, m+"\n", args...)
}
up.Stdout = logFile
up.Stderr = logFile
return logFile, nil
}
func (up *Updater) installMSI(msi string) error {
var err error
for tries := 0; tries < 2; tries++ {
cmd := exec.Command("msiexec.exe", "/i", filepath.Base(msi), "/quiet", "/norestart", "/qn")
cmd.Dir = filepath.Dir(msi)
cmd.Stdout = up.Stdout
cmd.Stderr = up.Stderr
cmd.Stdin = os.Stdin
err = cmd.Run()
if err == nil {
break
}
up.Logf("Install attempt failed: %v", err)
uninstallVersion := up.currentVersion
if v := os.Getenv("TS_DEBUG_UNINSTALL_VERSION"); v != "" {
uninstallVersion = v
}
// Assume it's a downgrade, which msiexec won't permit. Uninstall our current version first.
up.Logf("Uninstalling current version %q for downgrade...", uninstallVersion)
cmd = exec.Command("msiexec.exe", "/x", msiUUIDForVersion(uninstallVersion), "/norestart", "/qn")
cmd.Stdout = up.Stdout
cmd.Stderr = up.Stderr
cmd.Stdin = os.Stdin
err = cmd.Run()
up.Logf("msiexec uninstall: %v", err)
}
return err
}
// cleanupOldDownloads removes all files matching glob (see filepath.Glob).
// Only regular files are removed, so the glob must match specific files and
// not directories.
@@ -933,53 +777,6 @@ func (up *Updater) cleanupOldDownloads(glob string) {
}
}
func msiUUIDForVersion(ver string) string {
arch := runtime.GOARCH
if arch == "386" {
arch = "x86"
}
track, err := versionToTrack(ver)
if err != nil {
track = UnstableTrack
}
msiURL := fmt.Sprintf("https://pkgs.tailscale.com/%s/tailscale-setup-%s-%s.msi", track, ver, arch)
return "{" + strings.ToUpper(uuid.NewSHA1(uuid.NameSpaceURL, []byte(msiURL)).String()) + "}"
}
func makeSelfCopy() (origPathExe, tmpPathExe string, err error) {
selfExe, err := os.Executable()
if err != nil {
return "", "", err
}
f, err := os.Open(selfExe)
if err != nil {
return "", "", err
}
defer f.Close()
f2, err := os.CreateTemp("", "tailscale-updater-*.exe")
if err != nil {
return "", "", err
}
if f := markTempFileFunc; f != nil {
if err := f(f2.Name()); err != nil {
return "", "", err
}
}
if _, err := io.Copy(f2, f); err != nil {
f2.Close()
return "", "", err
}
return selfExe, f2.Name(), f2.Close()
}
func (up *Updater) downloadURLToFile(pathSrc, fileDst string) (ret error) {
c, err := distsign.NewClient(up.Logf, up.PkgsAddr)
if err != nil {
return err
}
return c.Download(context.Background(), pathSrc, fileDst)
}
func (up *Updater) updateFreeBSD() (err error) {
if up.Version != "" {
return errors.New("installing a specific version on FreeBSD is not supported")

View File

@@ -0,0 +1,20 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build (linux && !android) || windows
package clientupdate
import (
"context"
"tailscale.com/clientupdate/distsign"
)
func (up *Updater) downloadURLToFile(pathSrc, fileDst string) (ret error) {
c, err := distsign.NewClient(up.Logf, up.PkgsAddr)
if err != nil {
return err
}
return c.Download(context.Background(), pathSrc, fileDst)
}

View File

@@ -0,0 +1,10 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !((linux && !android) || windows)
package clientupdate
func (up *Updater) downloadURLToFile(pathSrc, fileDst string) (ret error) {
panic("unreachable")
}

View File

@@ -0,0 +1,10 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !windows
package clientupdate
func (up *Updater) updateWindows() error {
panic("unreachable")
}

View File

@@ -7,13 +7,57 @@
package clientupdate
import (
"errors"
"fmt"
"io"
"os"
"os/exec"
"path"
"path/filepath"
"runtime"
"strings"
"github.com/google/uuid"
"golang.org/x/sys/windows"
"tailscale.com/util/winutil"
"tailscale.com/util/winutil/authenticode"
)
func init() {
markTempFileFunc = markTempFileWindows
verifyAuthenticode = verifyTailscale
const (
// winMSIEnv is the environment variable that, if set, is the MSI file for
// the update command to install. It's passed like this so we can stop the
// tailscale.exe process from running before the msiexec process runs and
// tries to overwrite ourselves.
winMSIEnv = "TS_UPDATE_WIN_MSI"
// winExePathEnv is the environment variable that is set along with
// winMSIEnv and carries the full path of the calling tailscale.exe binary.
// It is used to re-launch the GUI process (tailscale-ipn.exe) after
// install is complete.
winExePathEnv = "TS_UPDATE_WIN_EXE_PATH"
)
func makeSelfCopy() (origPathExe, tmpPathExe string, err error) {
selfExe, err := os.Executable()
if err != nil {
return "", "", err
}
f, err := os.Open(selfExe)
if err != nil {
return "", "", err
}
defer f.Close()
f2, err := os.CreateTemp("", "tailscale-updater-*.exe")
if err != nil {
return "", "", err
}
if err := markTempFileWindows(f2.Name()); err != nil {
return "", "", err
}
if _, err := io.Copy(f2, f); err != nil {
f2.Close()
return "", "", err
}
return selfExe, f2.Name(), f2.Close()
}
func markTempFileWindows(name string) error {
@@ -23,6 +67,159 @@ func markTempFileWindows(name string) error {
const certSubjectTailscale = "Tailscale Inc."
func verifyTailscale(path string) error {
func verifyAuthenticode(path string) error {
return authenticode.Verify(path, certSubjectTailscale)
}
func (up *Updater) updateWindows() error {
if msi := os.Getenv(winMSIEnv); msi != "" {
// stdout/stderr from this part of the install could be lost since the
// parent tailscaled is replaced. Create a temp log file to have some
// output to debug with in case update fails.
close, err := up.switchOutputToFile()
if err != nil {
up.Logf("failed to create log file for installation: %v; proceeding with existing outputs", err)
} else {
defer close.Close()
}
up.Logf("installing %v ...", msi)
if err := up.installMSI(msi); err != nil {
up.Logf("MSI install failed: %v", err)
return err
}
up.Logf("success.")
return nil
}
if !winutil.IsCurrentProcessElevated() {
return errors.New(`update must be run as Administrator
you can run the command prompt as Administrator one of these ways:
* right-click cmd.exe, select 'Run as administrator'
* press Windows+x, then press a
* press Windows+r, type in "cmd", then press Ctrl+Shift+Enter`)
}
ver, err := requestedTailscaleVersion(up.Version, up.Track)
if err != nil {
return err
}
arch := runtime.GOARCH
if arch == "386" {
arch = "x86"
}
if !up.confirm(ver) {
return nil
}
tsDir := filepath.Join(os.Getenv("ProgramData"), "Tailscale")
msiDir := filepath.Join(tsDir, "MSICache")
if fi, err := os.Stat(tsDir); err != nil {
return fmt.Errorf("expected %s to exist, got stat error: %w", tsDir, err)
} else if !fi.IsDir() {
return fmt.Errorf("expected %s to be a directory; got %v", tsDir, fi.Mode())
}
if err := os.MkdirAll(msiDir, 0700); err != nil {
return err
}
up.cleanupOldDownloads(filepath.Join(msiDir, "*.msi"))
pkgsPath := fmt.Sprintf("%s/tailscale-setup-%s-%s.msi", up.Track, ver, arch)
msiTarget := filepath.Join(msiDir, path.Base(pkgsPath))
if err := up.downloadURLToFile(pkgsPath, msiTarget); err != nil {
return err
}
up.Logf("verifying MSI authenticode...")
if err := verifyAuthenticode(msiTarget); err != nil {
return fmt.Errorf("authenticode verification of %s failed: %w", msiTarget, err)
}
up.Logf("authenticode verification succeeded")
up.Logf("making tailscale.exe copy to switch to...")
up.cleanupOldDownloads(filepath.Join(os.TempDir(), "tailscale-updater-*.exe"))
selfOrig, selfCopy, err := makeSelfCopy()
if err != nil {
return err
}
defer os.Remove(selfCopy)
up.Logf("running tailscale.exe copy for final install...")
cmd := exec.Command(selfCopy, "update")
cmd.Env = append(os.Environ(), winMSIEnv+"="+msiTarget, winExePathEnv+"="+selfOrig)
cmd.Stdout = up.Stderr
cmd.Stderr = up.Stderr
cmd.Stdin = os.Stdin
if err := cmd.Start(); err != nil {
return err
}
// Once it's started, exit ourselves, so the binary is free
// to be replaced.
os.Exit(0)
panic("unreachable")
}
func (up *Updater) installMSI(msi string) error {
var err error
for tries := 0; tries < 2; tries++ {
cmd := exec.Command("msiexec.exe", "/i", filepath.Base(msi), "/quiet", "/norestart", "/qn")
cmd.Dir = filepath.Dir(msi)
cmd.Stdout = up.Stdout
cmd.Stderr = up.Stderr
cmd.Stdin = os.Stdin
err = cmd.Run()
if err == nil {
break
}
up.Logf("Install attempt failed: %v", err)
uninstallVersion := up.currentVersion
if v := os.Getenv("TS_DEBUG_UNINSTALL_VERSION"); v != "" {
uninstallVersion = v
}
// Assume it's a downgrade, which msiexec won't permit. Uninstall our current version first.
up.Logf("Uninstalling current version %q for downgrade...", uninstallVersion)
cmd = exec.Command("msiexec.exe", "/x", msiUUIDForVersion(uninstallVersion), "/norestart", "/qn")
cmd.Stdout = up.Stdout
cmd.Stderr = up.Stderr
cmd.Stdin = os.Stdin
err = cmd.Run()
up.Logf("msiexec uninstall: %v", err)
}
return err
}
func msiUUIDForVersion(ver string) string {
arch := runtime.GOARCH
if arch == "386" {
arch = "x86"
}
track, err := versionToTrack(ver)
if err != nil {
track = UnstableTrack
}
msiURL := fmt.Sprintf("https://pkgs.tailscale.com/%s/tailscale-setup-%s-%s.msi", track, ver, arch)
return "{" + strings.ToUpper(uuid.NewSHA1(uuid.NameSpaceURL, []byte(msiURL)).String()) + "}"
}
func (up *Updater) switchOutputToFile() (io.Closer, error) {
var logFilePath string
exePath, err := os.Executable()
if err != nil {
logFilePath = filepath.Join(os.TempDir(), "tailscale-updater.log")
} else {
logFilePath = strings.TrimSuffix(exePath, ".exe") + ".log"
}
up.Logf("writing update output to %q", logFilePath)
logFile, err := os.Create(logFilePath)
if err != nil {
return nil, err
}
up.Logf = func(m string, args ...any) {
fmt.Fprintf(logFile, m+"\n", args...)
}
up.Stdout = logFile
up.Stderr = logFile
return logFile, nil
}

View File

@@ -47,7 +47,7 @@ func main() {
it := codegen.NewImportTracker(pkg.Types)
buf := new(bytes.Buffer)
for _, typeName := range typeNames {
typ, ok := namedTypes[typeName]
typ, ok := namedTypes[typeName].(*types.Named)
if !ok {
log.Fatalf("could not find type %s", typeName)
}
@@ -78,7 +78,11 @@ func main() {
w(" return false")
w("}")
}
cloneOutput := pkg.Name + "_clone.go"
cloneOutput := pkg.Name + "_clone"
if *flagBuildTags == "test" {
cloneOutput += "_test"
}
cloneOutput += ".go"
if err := codegen.WritePackageFile("tailscale.com/cmd/cloner", pkg, cloneOutput, it, buf); err != nil {
log.Fatal(err)
}
@@ -91,16 +95,19 @@ func gen(buf *bytes.Buffer, it *codegen.ImportTracker, typ *types.Named) {
}
name := typ.Obj().Name()
typeParams := typ.Origin().TypeParams()
_, typeParamNames := codegen.FormatTypeParams(typeParams, it)
nameWithParams := name + typeParamNames
fmt.Fprintf(buf, "// Clone makes a deep copy of %s.\n", name)
fmt.Fprintf(buf, "// The result aliases no memory with the original.\n")
fmt.Fprintf(buf, "func (src *%s) Clone() *%s {\n", name, name)
fmt.Fprintf(buf, "func (src *%s) Clone() *%s {\n", nameWithParams, nameWithParams)
writef := func(format string, args ...any) {
fmt.Fprintf(buf, "\t"+format+"\n", args...)
}
writef("if src == nil {")
writef("\treturn nil")
writef("}")
writef("dst := new(%s)", name)
writef("dst := new(%s)", nameWithParams)
writef("*dst = *src")
for i := range t.NumFields() {
fname := t.Field(i).Name()
@@ -108,7 +115,7 @@ func gen(buf *bytes.Buffer, it *codegen.ImportTracker, typ *types.Named) {
if !codegen.ContainsPointers(ft) || codegen.HasNoClone(t.Tag(i)) {
continue
}
if named, _ := ft.(*types.Named); named != nil {
if named, _ := codegen.NamedTypeOf(ft); named != nil {
if codegen.IsViewType(ft) {
writef("dst.%s = src.%s", fname, fname)
continue
@@ -126,16 +133,23 @@ func gen(buf *bytes.Buffer, it *codegen.ImportTracker, typ *types.Named) {
writef("dst.%s = make([]%s, len(src.%s))", fname, n, fname)
writef("for i := range dst.%s {", fname)
if ptr, isPtr := ft.Elem().(*types.Pointer); isPtr {
if _, isBasic := ptr.Elem().Underlying().(*types.Basic); isBasic {
it.Import("tailscale.com/types/ptr")
writef("if src.%s[i] == nil { dst.%s[i] = nil } else {", fname, fname)
writef("\tdst.%s[i] = ptr.To(*src.%s[i])", fname, fname)
writef("}")
writef("if src.%s[i] == nil { dst.%s[i] = nil } else {", fname, fname)
if codegen.ContainsPointers(ptr.Elem()) {
if _, isIface := ptr.Elem().Underlying().(*types.Interface); isIface {
it.Import("tailscale.com/types/ptr")
writef("\tdst.%s[i] = ptr.To((*src.%s[i]).Clone())", fname, fname)
} else {
writef("\tdst.%s[i] = src.%s[i].Clone()", fname, fname)
}
} else {
writef("\tdst.%s[i] = src.%s[i].Clone()", fname, fname)
it.Import("tailscale.com/types/ptr")
writef("\tdst.%s[i] = ptr.To(*src.%s[i])", fname, fname)
}
writef("}")
} else if ft.Elem().String() == "encoding/json.RawMessage" {
writef("\tdst.%s[i] = append(src.%s[i][:0:0], src.%s[i]...)", fname, fname, fname)
} else if _, isIface := ft.Elem().Underlying().(*types.Interface); isIface {
writef("\tdst.%s[i] = src.%s[i].Clone()", fname, fname)
} else {
writef("\tdst.%s[i] = *src.%s[i].Clone()", fname, fname)
}
@@ -145,14 +159,19 @@ func gen(buf *bytes.Buffer, it *codegen.ImportTracker, typ *types.Named) {
writef("dst.%s = append(src.%s[:0:0], src.%s...)", fname, fname, fname)
}
case *types.Pointer:
if named, _ := ft.Elem().(*types.Named); named != nil && codegen.ContainsPointers(ft.Elem()) {
base := ft.Elem()
hasPtrs := codegen.ContainsPointers(base)
if named, _ := codegen.NamedTypeOf(base); named != nil && hasPtrs {
writef("dst.%s = src.%s.Clone()", fname, fname)
continue
}
it.Import("tailscale.com/types/ptr")
writef("if dst.%s != nil {", fname)
writef("\tdst.%s = ptr.To(*src.%s)", fname, fname)
if codegen.ContainsPointers(ft.Elem()) {
if _, isIface := base.Underlying().(*types.Interface); isIface && hasPtrs {
writef("\tdst.%s = ptr.To((*src.%s).Clone())", fname, fname)
} else if !hasPtrs {
writef("\tdst.%s = ptr.To(*src.%s)", fname, fname)
} else {
writef("\t" + `panic("TODO pointers in pointers")`)
}
writef("}")
@@ -172,18 +191,50 @@ func gen(buf *bytes.Buffer, it *codegen.ImportTracker, typ *types.Named) {
writef("if dst.%s != nil {", fname)
writef("\tdst.%s = map[%s]%s{}", fname, it.QualifiedName(ft.Key()), it.QualifiedName(elem))
writef("\tfor k, v := range src.%s {", fname)
switch elem.(type) {
switch elem := elem.Underlying().(type) {
case *types.Pointer:
writef("\t\tdst.%s[k] = v.Clone()", fname)
writef("\t\tif v == nil { dst.%s[k] = nil } else {", fname)
if base := elem.Elem().Underlying(); codegen.ContainsPointers(base) {
if _, isIface := base.(*types.Interface); isIface {
it.Import("tailscale.com/types/ptr")
writef("\t\t\tdst.%s[k] = ptr.To((*v).Clone())", fname)
} else {
writef("\t\t\tdst.%s[k] = v.Clone()", fname)
}
} else {
it.Import("tailscale.com/types/ptr")
writef("\t\t\tdst.%s[k] = ptr.To(*v)", fname)
}
writef("}")
case *types.Interface:
if cloneResultType := methodResultType(elem, "Clone"); cloneResultType != nil {
if _, isPtr := cloneResultType.(*types.Pointer); isPtr {
writef("\t\tdst.%s[k] = *(v.Clone())", fname)
} else {
writef("\t\tdst.%s[k] = v.Clone()", fname)
}
} else {
writef(`panic("%s (%v) does not have a Clone method")`, fname, elem)
}
default:
writef("\t\tdst.%s[k] = *(v.Clone())", fname)
}
writef("\t}")
writef("}")
} else {
it.Import("maps")
writef("\tdst.%s = maps.Clone(src.%s)", fname, fname)
}
case *types.Interface:
// If ft is an interface with a "Clone() ft" method, it can be used to clone the field.
// This includes scenarios where ft is a constrained type parameter.
if cloneResultType := methodResultType(ft, "Clone"); cloneResultType.Underlying() == ft {
writef("dst.%s = src.%s.Clone()", fname, fname)
continue
}
writef(`panic("%s (%v) does not have a compatible Clone method")`, fname, ft)
default:
writef(`panic("TODO: %s (%T)")`, fname, ft)
}
@@ -191,7 +242,7 @@ func gen(buf *bytes.Buffer, it *codegen.ImportTracker, typ *types.Named) {
writef("return dst")
fmt.Fprintf(buf, "}\n\n")
buf.Write(codegen.AssertStructUnchanged(t, name, "Clone", it))
buf.Write(codegen.AssertStructUnchanged(t, name, typeParams, "Clone", it))
}
// hasBasicUnderlying reports true when typ.Underlying() is a slice or a map.
@@ -203,3 +254,15 @@ func hasBasicUnderlying(typ types.Type) bool {
return false
}
}
func methodResultType(typ types.Type, method string) types.Type {
viewMethod := codegen.LookupMethod(typ, method)
if viewMethod == nil {
return nil
}
sig, ok := viewMethod.Type().(*types.Signature)
if !ok || sig.Results().Len() != 1 {
return nil
}
return sig.Results().At(0).Type()
}

View File

@@ -3,6 +3,7 @@
//go:generate go run tailscale.com/cmd/cloner -clonefunc=true -type SliceContainer
// Package clonerex is an example package for the cloner tool.
package clonerex
type SliceContainer struct {

View File

@@ -0,0 +1,262 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build linux
package main
import (
"context"
"fmt"
"log"
"net"
"net/netip"
"os"
"path/filepath"
"strings"
"tailscale.com/util/linuxfw"
)
// ensureIPForwarding enables IPv4/IPv6 forwarding for the container.
func ensureIPForwarding(root, clusterProxyTargetIP, tailnetTargetIP, tailnetTargetFQDN string, routes *string) error {
var (
v4Forwarding, v6Forwarding bool
)
if clusterProxyTargetIP != "" {
proxyIP, err := netip.ParseAddr(clusterProxyTargetIP)
if err != nil {
return fmt.Errorf("invalid cluster destination IP: %v", err)
}
if proxyIP.Is4() {
v4Forwarding = true
} else {
v6Forwarding = true
}
}
if tailnetTargetIP != "" {
proxyIP, err := netip.ParseAddr(tailnetTargetIP)
if err != nil {
return fmt.Errorf("invalid tailnet destination IP: %v", err)
}
if proxyIP.Is4() {
v4Forwarding = true
} else {
v6Forwarding = true
}
}
// Currently we only proxy traffic to the IPv4 address of the tailnet
// target.
if tailnetTargetFQDN != "" {
v4Forwarding = true
}
if routes != nil && *routes != "" {
for _, route := range strings.Split(*routes, ",") {
cidr, err := netip.ParsePrefix(route)
if err != nil {
return fmt.Errorf("invalid subnet route: %v", err)
}
if cidr.Addr().Is4() {
v4Forwarding = true
} else {
v6Forwarding = true
}
}
}
return enableIPForwarding(v4Forwarding, v6Forwarding, root)
}
func enableIPForwarding(v4Forwarding, v6Forwarding bool, root string) error {
var paths []string
if v4Forwarding {
paths = append(paths, filepath.Join(root, "proc/sys/net/ipv4/ip_forward"))
}
if v6Forwarding {
paths = append(paths, filepath.Join(root, "proc/sys/net/ipv6/conf/all/forwarding"))
}
// In some common configurations (e.g. default docker,
// kubernetes), the container environment denies write access to
// most sysctls, including IP forwarding controls. Check the
// sysctl values before trying to change them, so that we
// gracefully do nothing if the container's already been set up
// properly by e.g. a k8s initContainer.
for _, path := range paths {
bs, err := os.ReadFile(path)
if err != nil {
return fmt.Errorf("reading %q: %w", path, err)
}
if v := strings.TrimSpace(string(bs)); v != "1" {
if err := os.WriteFile(path, []byte("1"), 0644); err != nil {
return fmt.Errorf("enabling %q: %w", path, err)
}
}
}
return nil
}
func installEgressForwardingRule(_ context.Context, dstStr string, tsIPs []netip.Prefix, nfr linuxfw.NetfilterRunner) error {
dst, err := netip.ParseAddr(dstStr)
if err != nil {
return err
}
var local netip.Addr
for _, pfx := range tsIPs {
if !pfx.IsSingleIP() {
continue
}
if pfx.Addr().Is4() != dst.Is4() {
continue
}
local = pfx.Addr()
break
}
if !local.IsValid() {
return fmt.Errorf("no tailscale IP matching family of %s found in %v", dstStr, tsIPs)
}
if err := nfr.DNATNonTailscaleTraffic("tailscale0", dst); err != nil {
return fmt.Errorf("installing egress proxy rules: %w", err)
}
if err := nfr.EnsureSNATForDst(local, dst); err != nil {
return fmt.Errorf("installing egress proxy rules: %w", err)
}
if err := nfr.ClampMSSToPMTU("tailscale0", dst); err != nil {
return fmt.Errorf("installing egress proxy rules: %w", err)
}
return nil
}
// installTSForwardingRuleForDestination accepts a destination address and a
// list of node's tailnet addresses, sets up rules to forward traffic for
// destination to the tailnet IP matching the destination IP family.
// Destination can be Pod IP of this node.
func installTSForwardingRuleForDestination(_ context.Context, dstFilter string, tsIPs []netip.Prefix, nfr linuxfw.NetfilterRunner) error {
dst, err := netip.ParseAddr(dstFilter)
if err != nil {
return err
}
var local netip.Addr
for _, pfx := range tsIPs {
if !pfx.IsSingleIP() {
continue
}
if pfx.Addr().Is4() != dst.Is4() {
continue
}
local = pfx.Addr()
break
}
if !local.IsValid() {
return fmt.Errorf("no tailscale IP matching family of %s found in %v", dstFilter, tsIPs)
}
if err := nfr.AddDNATRule(dst, local); err != nil {
return fmt.Errorf("installing rule for forwarding traffic to tailnet IP: %w", err)
}
return nil
}
func installIngressForwardingRule(_ context.Context, dstStr string, tsIPs []netip.Prefix, nfr linuxfw.NetfilterRunner) error {
dst, err := netip.ParseAddr(dstStr)
if err != nil {
return err
}
var local netip.Addr
proxyHasIPv4Address := false
for _, pfx := range tsIPs {
if !pfx.IsSingleIP() {
continue
}
if pfx.Addr().Is4() {
proxyHasIPv4Address = true
}
if pfx.Addr().Is4() != dst.Is4() {
continue
}
local = pfx.Addr()
break
}
if proxyHasIPv4Address && dst.Is6() {
log.Printf("Warning: proxy backend ClusterIP is an IPv6 address and the proxy has a IPv4 tailnet address. You might need to disable IPv4 address allocation for the proxy for forwarding to work. See https://github.com/tailscale/tailscale/issues/12156")
}
if !local.IsValid() {
return fmt.Errorf("no tailscale IP matching family of %s found in %v", dstStr, tsIPs)
}
if err := nfr.AddDNATRule(local, dst); err != nil {
return fmt.Errorf("installing ingress proxy rules: %w", err)
}
if err := nfr.ClampMSSToPMTU("tailscale0", dst); err != nil {
return fmt.Errorf("installing ingress proxy rules: %w", err)
}
return nil
}
func installIngressForwardingRuleForDNSTarget(_ context.Context, backendAddrs []net.IP, tsIPs []netip.Prefix, nfr linuxfw.NetfilterRunner) error {
var (
tsv4 netip.Addr
tsv6 netip.Addr
v4Backends []netip.Addr
v6Backends []netip.Addr
)
for _, pfx := range tsIPs {
if pfx.IsSingleIP() && pfx.Addr().Is4() {
tsv4 = pfx.Addr()
continue
}
if pfx.IsSingleIP() && pfx.Addr().Is6() {
tsv6 = pfx.Addr()
continue
}
}
// TODO: log if more than one backend address is found and firewall is
// in nftables mode that only the first IP will be used.
for _, ip := range backendAddrs {
if ip.To4() != nil {
v4Backends = append(v4Backends, netip.AddrFrom4([4]byte(ip.To4())))
}
if ip.To16() != nil {
v6Backends = append(v6Backends, netip.AddrFrom16([16]byte(ip.To16())))
}
}
// Enable IP forwarding here as opposed to at the start of containerboot
// as the IPv4/IPv6 requirements might have changed.
// For Kubernetes operator proxies, forwarding for both IPv4 and IPv6 is
// enabled by an init container, so in practice enabling forwarding here
// is only needed if this proxy has been configured by manually setting
// TS_EXPERIMENTAL_DEST_DNS_NAME env var for a containerboot instance.
if err := enableIPForwarding(len(v4Backends) != 0, len(v6Backends) != 0, ""); err != nil {
log.Printf("[unexpected] failed to ensure IP forwarding: %v", err)
}
updateFirewall := func(dst netip.Addr, backendTargets []netip.Addr) error {
if err := nfr.DNATWithLoadBalancer(dst, backendTargets); err != nil {
return fmt.Errorf("installing DNAT rules for ingress backends %+#v: %w", backendTargets, err)
}
// The backend might advertize MSS higher than that of the
// tailscale interfaces. Clamp MSS of packets going out via
// tailscale0 interface to its MTU to prevent broken connections
// in environments where path MTU discovery is not working.
if err := nfr.ClampMSSToPMTU("tailscale0", dst); err != nil {
return fmt.Errorf("adding rule to clamp traffic via tailscale0: %v", err)
}
return nil
}
if len(v4Backends) != 0 {
if !tsv4.IsValid() {
log.Printf("backend targets %v contain at least one IPv4 address, but this node's Tailscale IPs do not contain a valid IPv4 address: %v", backendAddrs, tsIPs)
} else if err := updateFirewall(tsv4, v4Backends); err != nil {
return fmt.Errorf("Installing IPv4 firewall rules: %w", err)
}
}
if len(v6Backends) != 0 && !tsv6.IsValid() {
if !tsv6.IsValid() {
log.Printf("backend targets %v contain at least one IPv6 address, but this node's Tailscale IPs do not contain a valid IPv6 address: %v", backendAddrs, tsIPs)
} else if !nfr.HasIPV6NAT() {
log.Printf("backend targets %v contain at least one IPv6 address, but the chosen firewall mode does not support IPv6 NAT", backendAddrs)
} else if err := updateFirewall(tsv6, v6Backends); err != nil {
return fmt.Errorf("Installing IPv6 firewall rules: %w", err)
}
}
return nil
}

View File

@@ -0,0 +1,51 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build linux
package main
import (
"log"
"net"
"net/http"
"sync"
)
// healthz is a simple health check server, if enabled it returns 200 OK if
// this tailscale node currently has at least one tailnet IP address else
// returns 503.
type healthz struct {
sync.Mutex
hasAddrs bool
}
func (h *healthz) ServeHTTP(w http.ResponseWriter, r *http.Request) {
h.Lock()
defer h.Unlock()
if h.hasAddrs {
w.Write([]byte("ok"))
} else {
http.Error(w, "node currently has no tailscale IPs", http.StatusInternalServerError)
}
}
// runHealthz runs a simple HTTP health endpoint on /healthz, listening on the
// provided address. A containerized tailscale instance is considered healthy if
// it has at least one tailnet IP address.
func runHealthz(addr string, h *healthz) {
lis, err := net.Listen("tcp", addr)
if err != nil {
log.Fatalf("error listening on the provided health endpoint address %q: %v", addr, err)
}
mux := http.NewServeMux()
mux.Handle("/healthz", h)
log.Printf("Running healthcheck endpoint at %s/healthz", addr)
hs := &http.Server{Handler: mux}
go func() {
if err := hs.Serve(lis); err != nil {
log.Fatalf("failed running health endpoint: %v", err)
}
}()
}

View File

@@ -8,21 +8,21 @@ package main
import (
"context"
"encoding/json"
"errors"
"fmt"
"log"
"net/http"
"net/netip"
"os"
"tailscale.com/kube"
"tailscale.com/kube/kubeapi"
"tailscale.com/kube/kubeclient"
"tailscale.com/tailcfg"
)
// storeDeviceID writes deviceID to 'device_id' data field of the named
// Kubernetes Secret.
func storeDeviceID(ctx context.Context, secretName string, deviceID tailcfg.StableNodeID) error {
s := &kube.Secret{
s := &kubeapi.Secret{
Data: map[string][]byte{
"device_id": []byte(deviceID),
},
@@ -42,7 +42,7 @@ func storeDeviceEndpoints(ctx context.Context, secretName string, fqdn string, a
return err
}
s := &kube.Secret{
s := &kubeapi.Secret{
Data: map[string][]byte{
"device_fqdn": []byte(fqdn),
"device_ips": deviceIPs,
@@ -55,14 +55,14 @@ func storeDeviceEndpoints(ctx context.Context, secretName string, fqdn string, a
// secret. No-op if there is no authkey in the secret.
func deleteAuthKey(ctx context.Context, secretName string) error {
// m is a JSON Patch data structure, see https://jsonpatch.com/ or RFC 6902.
m := []kube.JSONPatch{
m := []kubeclient.JSONPatch{
{
Op: "remove",
Path: "/data/authkey",
},
}
if err := kc.JSONPatchSecret(ctx, secretName, m); err != nil {
if s, ok := err.(*kube.Status); ok && s.Code == http.StatusUnprocessableEntity {
if s, ok := err.(*kubeapi.Status); ok && s.Code == http.StatusUnprocessableEntity {
// This is kubernetes-ese for "the field you asked to
// delete already doesn't exist", aka no-op.
return nil
@@ -72,66 +72,16 @@ func deleteAuthKey(ctx context.Context, secretName string) error {
return nil
}
var kc kube.Client
// setupKube is responsible for doing any necessary configuration and checks to
// ensure that tailscale state storage and authentication mechanism will work on
// Kubernetes.
func (cfg *settings) setupKube(ctx context.Context) error {
if cfg.KubeSecret == "" {
return nil
}
canPatch, canCreate, err := kc.CheckSecretPermissions(ctx, cfg.KubeSecret)
if err != nil {
return fmt.Errorf("Some Kubernetes permissions are missing, please check your RBAC configuration: %v", err)
}
cfg.KubernetesCanPatch = canPatch
s, err := kc.GetSecret(ctx, cfg.KubeSecret)
if err != nil && kube.IsNotFoundErr(err) && !canCreate {
return fmt.Errorf("Tailscale state Secret %s does not exist and we don't have permissions to create it. "+
"If you intend to store tailscale state elsewhere than a Kubernetes Secret, "+
"you can explicitly set TS_KUBE_SECRET env var to an empty string. "+
"Else ensure that RBAC is set up that allows the service account associated with this installation to create Secrets.", cfg.KubeSecret)
} else if err != nil && !kube.IsNotFoundErr(err) {
return fmt.Errorf("Getting Tailscale state Secret %s: %v", cfg.KubeSecret, err)
}
if cfg.AuthKey == "" && !isOneStepConfig(cfg) {
if s == nil {
log.Print("TS_AUTHKEY not provided and kube secret does not exist, login will be interactive if needed.")
return nil
}
keyBytes, _ := s.Data["authkey"]
key := string(keyBytes)
if key != "" {
// This behavior of pulling authkeys from kube secrets was added
// at the same time as the patch permission, so we can enforce
// that we must be able to patch out the authkey after
// authenticating if you want to use this feature. This avoids
// us having to deal with the case where we might leave behind
// an unnecessary reusable authkey in a secret, like a rake in
// the grass.
if !cfg.KubernetesCanPatch {
return errors.New("authkey found in TS_KUBE_SECRET, but the pod doesn't have patch permissions on the secret to manage the authkey.")
}
cfg.AuthKey = key
} else {
log.Print("No authkey found in kube secret and TS_AUTHKEY not provided, login will be interactive if needed.")
}
}
return nil
}
var kc kubeclient.Client
func initKubeClient(root string) {
if root != "/" {
// If we are running in a test, we need to set the root path to the fake
// service account directory.
kube.SetRootPathForTesting(root)
kubeclient.SetRootPathForTesting(root)
}
var err error
kc, err = kube.New()
kc, err = kubeclient.New()
if err != nil {
log.Fatalf("Error creating kube client: %v", err)
}

View File

@@ -11,7 +11,8 @@ import (
"testing"
"github.com/google/go-cmp/cmp"
"tailscale.com/kube"
"tailscale.com/kube/kubeapi"
"tailscale.com/kube/kubeclient"
)
func TestSetupKube(t *testing.T) {
@@ -20,7 +21,7 @@ func TestSetupKube(t *testing.T) {
cfg *settings
wantErr bool
wantCfg *settings
kc kube.Client
kc kubeclient.Client
}{
{
name: "TS_AUTHKEY set, state Secret exists",
@@ -28,11 +29,11 @@ func TestSetupKube(t *testing.T) {
AuthKey: "foo",
KubeSecret: "foo",
},
kc: &kube.FakeClient{
kc: &kubeclient.FakeClient{
CheckSecretPermissionsImpl: func(context.Context, string) (bool, bool, error) {
return false, false, nil
},
GetSecretImpl: func(context.Context, string) (*kube.Secret, error) {
GetSecretImpl: func(context.Context, string) (*kubeapi.Secret, error) {
return nil, nil
},
},
@@ -47,12 +48,12 @@ func TestSetupKube(t *testing.T) {
AuthKey: "foo",
KubeSecret: "foo",
},
kc: &kube.FakeClient{
kc: &kubeclient.FakeClient{
CheckSecretPermissionsImpl: func(context.Context, string) (bool, bool, error) {
return false, true, nil
},
GetSecretImpl: func(context.Context, string) (*kube.Secret, error) {
return nil, &kube.Status{Code: 404}
GetSecretImpl: func(context.Context, string) (*kubeapi.Secret, error) {
return nil, &kubeapi.Status{Code: 404}
},
},
wantCfg: &settings{
@@ -66,12 +67,12 @@ func TestSetupKube(t *testing.T) {
AuthKey: "foo",
KubeSecret: "foo",
},
kc: &kube.FakeClient{
kc: &kubeclient.FakeClient{
CheckSecretPermissionsImpl: func(context.Context, string) (bool, bool, error) {
return false, false, nil
},
GetSecretImpl: func(context.Context, string) (*kube.Secret, error) {
return nil, &kube.Status{Code: 404}
GetSecretImpl: func(context.Context, string) (*kubeapi.Secret, error) {
return nil, &kubeapi.Status{Code: 404}
},
},
wantCfg: &settings{
@@ -86,12 +87,12 @@ func TestSetupKube(t *testing.T) {
AuthKey: "foo",
KubeSecret: "foo",
},
kc: &kube.FakeClient{
kc: &kubeclient.FakeClient{
CheckSecretPermissionsImpl: func(context.Context, string) (bool, bool, error) {
return false, false, nil
},
GetSecretImpl: func(context.Context, string) (*kube.Secret, error) {
return nil, &kube.Status{Code: 403}
GetSecretImpl: func(context.Context, string) (*kubeapi.Secret, error) {
return nil, &kubeapi.Status{Code: 403}
},
},
wantCfg: &settings{
@@ -110,7 +111,7 @@ func TestSetupKube(t *testing.T) {
AuthKey: "foo",
KubeSecret: "foo",
},
kc: &kube.FakeClient{
kc: &kubeclient.FakeClient{
CheckSecretPermissionsImpl: func(context.Context, string) (bool, bool, error) {
return false, false, errors.New("broken")
},
@@ -126,12 +127,12 @@ func TestSetupKube(t *testing.T) {
wantCfg: &settings{
KubeSecret: "foo",
},
kc: &kube.FakeClient{
kc: &kubeclient.FakeClient{
CheckSecretPermissionsImpl: func(context.Context, string) (bool, bool, error) {
return false, true, nil
},
GetSecretImpl: func(context.Context, string) (*kube.Secret, error) {
return nil, &kube.Status{Code: 404}
GetSecretImpl: func(context.Context, string) (*kubeapi.Secret, error) {
return nil, &kubeapi.Status{Code: 404}
},
},
},
@@ -144,12 +145,12 @@ func TestSetupKube(t *testing.T) {
wantCfg: &settings{
KubeSecret: "foo",
},
kc: &kube.FakeClient{
kc: &kubeclient.FakeClient{
CheckSecretPermissionsImpl: func(context.Context, string) (bool, bool, error) {
return false, false, nil
},
GetSecretImpl: func(context.Context, string) (*kube.Secret, error) {
return &kube.Secret{}, nil
GetSecretImpl: func(context.Context, string) (*kubeapi.Secret, error) {
return &kubeapi.Secret{}, nil
},
},
},
@@ -158,12 +159,12 @@ func TestSetupKube(t *testing.T) {
cfg: &settings{
KubeSecret: "foo",
},
kc: &kube.FakeClient{
kc: &kubeclient.FakeClient{
CheckSecretPermissionsImpl: func(context.Context, string) (bool, bool, error) {
return false, false, nil
},
GetSecretImpl: func(context.Context, string) (*kube.Secret, error) {
return &kube.Secret{Data: map[string][]byte{"authkey": []byte("foo")}}, nil
GetSecretImpl: func(context.Context, string) (*kubeapi.Secret, error) {
return &kubeapi.Secret{Data: map[string][]byte{"authkey": []byte("foo")}}, nil
},
},
wantCfg: &settings{
@@ -176,12 +177,12 @@ func TestSetupKube(t *testing.T) {
cfg: &settings{
KubeSecret: "foo",
},
kc: &kube.FakeClient{
kc: &kubeclient.FakeClient{
CheckSecretPermissionsImpl: func(context.Context, string) (bool, bool, error) {
return true, false, nil
},
GetSecretImpl: func(context.Context, string) (*kube.Secret, error) {
return &kube.Secret{Data: map[string][]byte{"authkey": []byte("foo")}}, nil
GetSecretImpl: func(context.Context, string) (*kubeapi.Secret, error) {
return &kubeapi.Secret{Data: map[string][]byte{"authkey": []byte("foo")}}, nil
},
},
wantCfg: &settings{

View File

@@ -52,6 +52,12 @@
// ${TS_CERT_DOMAIN}, it will be replaced with the value of the available FQDN.
// It cannot be used in conjunction with TS_DEST_IP. The file is watched for changes,
// and will be re-applied when it changes.
// - TS_HEALTHCHECK_ADDR_PORT: if specified, an HTTP health endpoint will be
// served at /healthz at the provided address, which should be in form [<address>]:<port>.
// If not set, no health check will be run. If set to :<port>, addr will default to 0.0.0.0
// The health endpoint will return 200 OK if this node has at least one tailnet IP address,
// otherwise returns 503.
// NB: the health criteria might change in the future.
// - TS_EXPERIMENTAL_VERSIONED_CONFIG_DIR: if specified, a path to a
// directory that containers tailscaled config in file. The config file needs to be
// named cap-<current-tailscaled-cap>.hujson. If this is set, TS_HOSTNAME,
@@ -86,9 +92,7 @@
package main
import (
"bytes"
"context"
"encoding/json"
"errors"
"fmt"
"io/fs"
@@ -97,24 +101,18 @@ import (
"net"
"net/netip"
"os"
"os/exec"
"os/signal"
"path"
"path/filepath"
"reflect"
"slices"
"strconv"
"strings"
"sync"
"sync/atomic"
"syscall"
"time"
"github.com/fsnotify/fsnotify"
"golang.org/x/sys/unix"
"tailscale.com/client/tailscale"
"tailscale.com/ipn"
"tailscale.com/ipn/conffile"
kubeutils "tailscale.com/k8s-operator"
"tailscale.com/tailcfg"
"tailscale.com/types/logger"
@@ -133,34 +131,9 @@ func newNetfilterRunner(logf logger.Logf) (linuxfw.NetfilterRunner, error) {
func main() {
log.SetPrefix("boot: ")
tailscale.I_Acknowledge_This_API_Is_Unstable = true
cfg := &settings{
AuthKey: defaultEnvs([]string{"TS_AUTHKEY", "TS_AUTH_KEY"}, ""),
Hostname: defaultEnv("TS_HOSTNAME", ""),
Routes: defaultEnvStringPointer("TS_ROUTES"),
ServeConfigPath: defaultEnv("TS_SERVE_CONFIG", ""),
ProxyTargetIP: defaultEnv("TS_DEST_IP", ""),
ProxyTargetDNSName: defaultEnv("TS_EXPERIMENTAL_DEST_DNS_NAME", ""),
TailnetTargetIP: defaultEnv("TS_TAILNET_TARGET_IP", ""),
TailnetTargetFQDN: defaultEnv("TS_TAILNET_TARGET_FQDN", ""),
DaemonExtraArgs: defaultEnv("TS_TAILSCALED_EXTRA_ARGS", ""),
ExtraArgs: defaultEnv("TS_EXTRA_ARGS", ""),
InKubernetes: os.Getenv("KUBERNETES_SERVICE_HOST") != "",
UserspaceMode: defaultBool("TS_USERSPACE", true),
StateDir: defaultEnv("TS_STATE_DIR", ""),
AcceptDNS: defaultEnvBoolPointer("TS_ACCEPT_DNS"),
KubeSecret: defaultEnv("TS_KUBE_SECRET", "tailscale"),
SOCKSProxyAddr: defaultEnv("TS_SOCKS5_SERVER", ""),
HTTPProxyAddr: defaultEnv("TS_OUTBOUND_HTTP_PROXY_LISTEN", ""),
Socket: defaultEnv("TS_SOCKET", "/tmp/tailscaled.sock"),
AuthOnce: defaultBool("TS_AUTH_ONCE", false),
Root: defaultEnv("TS_TEST_ONLY_ROOT", "/"),
TailscaledConfigFilePath: tailscaledConfigFilePath(),
AllowProxyingClusterTrafficViaIngress: defaultBool("EXPERIMENTAL_ALLOW_PROXYING_CLUSTER_TRAFFIC_VIA_INGRESS", false),
PodIP: defaultEnv("POD_IP", ""),
EnableForwardingOptimizations: defaultBool("TS_EXPERIMENTAL_ENABLE_FORWARDING_OPTIMIZATIONS", false),
}
if err := cfg.validate(); err != nil {
cfg, err := configFromEnv()
if err != nil {
log.Fatalf("invalid configuration: %v", err)
}
@@ -275,10 +248,8 @@ authLoop:
switch *n.State {
case ipn.NeedsLogin:
if isOneStepConfig(cfg) {
// This could happen if this is the
// first time tailscaled was run for
// this device and the auth key was not
// passed via the configfile.
// This could happen if this is the first time tailscaled was run for this
// device and the auth key was not passed via the configfile.
log.Fatalf("invalid state: tailscaled daemon started with a config file, but tailscale is not logged in: ensure you pass a valid auth key in the config file.")
}
if err := authTailscale(); err != nil {
@@ -349,6 +320,9 @@ authLoop:
certDomain = new(atomic.Pointer[string])
certDomainChanged = make(chan bool, 1)
h = &healthz{} // http server for the healthz endpoint
healthzRunner = sync.OnceFunc(func() { runHealthz(cfg.HealthCheckAddrPort, h) })
)
if cfg.ServeConfigPath != "" {
go watchServeConfigChanges(ctx, cfg.ServeConfigPath, certDomainChanged, certDomain, client)
@@ -373,6 +347,9 @@ authLoop:
}
})
)
// egressSvcsErrorChan will get an error sent to it if this containerboot instance is configured to expose 1+
// egress services in HA mode and errored.
var egressSvcsErrorChan = make(chan error)
defer t.Stop()
// resetTimer resets timer for when to next attempt to resolve the DNS
// name for the proxy configured with TS_EXPERIMENTAL_DEST_DNS_NAME. The
@@ -398,6 +375,7 @@ authLoop:
failedResolveAttempts++
}
var egressSvcsNotify chan ipn.Notify
notifyChan := make(chan ipn.Notify)
errChan := make(chan error)
go func() {
@@ -475,19 +453,25 @@ runLoop:
egressAddrs = node.Addresses().AsSlice()
newCurentEgressIPs = deephash.Hash(&egressAddrs)
egressIPsHaveChanged = newCurentEgressIPs != currentEgressIPs
if egressIPsHaveChanged && len(egressAddrs) != 0 {
// The firewall rules get (re-)installed:
// - on startup
// - when the tailnet IPs of the tailnet target have changed
// - when the tailnet IPs of this node have changed
if (egressIPsHaveChanged || ipsHaveChanged) && len(egressAddrs) != 0 {
var rulesInstalled bool
for _, egressAddr := range egressAddrs {
ea := egressAddr.Addr()
// TODO (irbekrm): make it work for IPv6 too.
if ea.Is6() {
log.Println("Not installing egress forwarding rules for IPv6 as this is currently not supported")
continue
}
log.Printf("Installing forwarding rules for destination %v", ea.String())
if err := installEgressForwardingRule(ctx, ea.String(), addrs, nfr); err != nil {
log.Fatalf("installing egress proxy rules for destination %s: %v", ea.String(), err)
if ea.Is4() || (ea.Is6() && nfr.HasIPV6NAT()) {
rulesInstalled = true
log.Printf("Installing forwarding rules for destination %v", ea.String())
if err := installEgressForwardingRule(ctx, ea.String(), addrs, nfr); err != nil {
log.Fatalf("installing egress proxy rules for destination %s: %v", ea.String(), err)
}
}
}
if !rulesInstalled {
log.Fatalf("no forwarding rules for egress addresses %v, host supports IPv6: %v", egressAddrs, nfr.HasIPV6NAT())
}
}
currentEgressIPs = newCurentEgressIPs
}
@@ -563,31 +547,57 @@ runLoop:
log.Fatalf("storing device IPs and FQDN in Kubernetes Secret: %v", err)
}
}
if cfg.HealthCheckAddrPort != "" {
h.Lock()
h.hasAddrs = len(addrs) != 0
h.Unlock()
healthzRunner()
}
if egressSvcsNotify != nil {
egressSvcsNotify <- n
}
}
if !startupTasksDone {
// For containerboot instances that act as TCP
// proxies (proxying traffic to an endpoint
// passed via one of the env vars that
// containerbot reads) and store state in a
// Kubernetes Secret, we consider startup tasks
// done at the point when device info has been
// successfully stored to state Secret.
// For all other containerboot instances, if we
// just get to this point the startup tasks can
// be considered done.
// For containerboot instances that act as TCP proxies (proxying traffic to an endpoint
// passed via one of the env vars that containerboot reads) and store state in a
// Kubernetes Secret, we consider startup tasks done at the point when device info has
// been successfully stored to state Secret. For all other containerboot instances, if
// we just get to this point the startup tasks can be considered done.
if !isL3Proxy(cfg) || !hasKubeStateStore(cfg) || (currentDeviceEndpoints != deephash.Sum{} && currentDeviceID != deephash.Sum{}) {
// This log message is used in tests to detect when all
// post-auth configuration is done.
log.Println("Startup complete, waiting for shutdown signal")
startupTasksDone = true
// Wait on tailscaled process. It won't
// be cleaned up by default when the
// container exits as it is not PID1.
// TODO (irbekrm): perhaps we can
// replace the reaper by a running
// cmd.Wait in a goroutine immediately
// after starting tailscaled?
// Configure egress proxy. Egress proxy will set up firewall rules to proxy
// traffic to tailnet targets configured in the provided configuration file. It
// will then continuously monitor the config file and netmap updates and
// reconfigure the firewall rules as needed. If any of its operations fail, it
// will crash this node.
if cfg.EgressSvcsCfgPath != "" {
log.Printf("configuring egress proxy using configuration file at %s", cfg.EgressSvcsCfgPath)
egressSvcsNotify = make(chan ipn.Notify)
ep := egressProxy{
cfgPath: cfg.EgressSvcsCfgPath,
nfr: nfr,
kc: kc,
stateSecret: cfg.KubeSecret,
netmapChan: egressSvcsNotify,
podIPv4: cfg.PodIPv4,
tailnetAddrs: addrs,
}
go func() {
if err := ep.run(ctx, n); err != nil {
egressSvcsErrorChan <- err
}
}()
}
// Wait on tailscaled process. It won't be cleaned up by default when the
// container exits as it is not PID1. TODO (irbekrm): perhaps we can replace the
// reaper by a running cmd.Wait in a goroutine immediately after starting
// tailscaled?
reaper := func() {
defer wg.Done()
for {
@@ -625,226 +635,13 @@ runLoop:
}
backendAddrs = newBackendAddrs
resetTimer(false)
case e := <-egressSvcsErrorChan:
log.Fatalf("egress proxy failed: %v", e)
}
}
wg.Wait()
}
// watchServeConfigChanges watches path for changes, and when it sees one, reads
// the serve config from it, replacing ${TS_CERT_DOMAIN} with certDomain, and
// applies it to lc. It exits when ctx is canceled. cdChanged is a channel that
// is written to when the certDomain changes, causing the serve config to be
// re-read and applied.
func watchServeConfigChanges(ctx context.Context, path string, cdChanged <-chan bool, certDomainAtomic *atomic.Pointer[string], lc *tailscale.LocalClient) {
if certDomainAtomic == nil {
panic("cd must not be nil")
}
var tickChan <-chan time.Time
var eventChan <-chan fsnotify.Event
if w, err := fsnotify.NewWatcher(); err != nil {
log.Printf("failed to create fsnotify watcher, timer-only mode: %v", err)
ticker := time.NewTicker(5 * time.Second)
defer ticker.Stop()
tickChan = ticker.C
} else {
defer w.Close()
if err := w.Add(filepath.Dir(path)); err != nil {
log.Fatalf("failed to add fsnotify watch: %v", err)
}
eventChan = w.Events
}
var certDomain string
var prevServeConfig *ipn.ServeConfig
for {
select {
case <-ctx.Done():
return
case <-cdChanged:
certDomain = *certDomainAtomic.Load()
case <-tickChan:
case <-eventChan:
// We can't do any reasonable filtering on the event because of how
// k8s handles these mounts. So just re-read the file and apply it
// if it's changed.
}
if certDomain == "" {
continue
}
sc, err := readServeConfig(path, certDomain)
if err != nil {
log.Fatalf("failed to read serve config: %v", err)
}
if prevServeConfig != nil && reflect.DeepEqual(sc, prevServeConfig) {
continue
}
log.Printf("Applying serve config")
if err := lc.SetServeConfig(ctx, sc); err != nil {
log.Fatalf("failed to set serve config: %v", err)
}
prevServeConfig = sc
}
}
// readServeConfig reads the ipn.ServeConfig from path, replacing
// ${TS_CERT_DOMAIN} with certDomain.
func readServeConfig(path, certDomain string) (*ipn.ServeConfig, error) {
if path == "" {
return nil, nil
}
j, err := os.ReadFile(path)
if err != nil {
return nil, err
}
j = bytes.ReplaceAll(j, []byte("${TS_CERT_DOMAIN}"), []byte(certDomain))
var sc ipn.ServeConfig
if err := json.Unmarshal(j, &sc); err != nil {
return nil, err
}
return &sc, nil
}
func startTailscaled(ctx context.Context, cfg *settings) (*tailscale.LocalClient, *os.Process, error) {
args := tailscaledArgs(cfg)
// tailscaled runs without context, since it needs to persist
// beyond the startup timeout in ctx.
cmd := exec.Command("tailscaled", args...)
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
cmd.SysProcAttr = &syscall.SysProcAttr{
Setpgid: true,
}
log.Printf("Starting tailscaled")
if err := cmd.Start(); err != nil {
return nil, nil, fmt.Errorf("starting tailscaled failed: %v", err)
}
// Wait for the socket file to appear, otherwise API ops will racily fail.
log.Printf("Waiting for tailscaled socket")
for {
if ctx.Err() != nil {
log.Fatalf("Timed out waiting for tailscaled socket")
}
_, err := os.Stat(cfg.Socket)
if errors.Is(err, fs.ErrNotExist) {
time.Sleep(100 * time.Millisecond)
continue
} else if err != nil {
log.Fatalf("Waiting for tailscaled socket: %v", err)
}
break
}
tsClient := &tailscale.LocalClient{
Socket: cfg.Socket,
UseSocketOnly: true,
}
return tsClient, cmd.Process, nil
}
// tailscaledArgs uses cfg to construct the argv for tailscaled.
func tailscaledArgs(cfg *settings) []string {
args := []string{"--socket=" + cfg.Socket}
switch {
case cfg.InKubernetes && cfg.KubeSecret != "":
args = append(args, "--state=kube:"+cfg.KubeSecret)
if cfg.StateDir == "" {
cfg.StateDir = "/tmp"
}
fallthrough
case cfg.StateDir != "":
args = append(args, "--statedir="+cfg.StateDir)
default:
args = append(args, "--state=mem:", "--statedir=/tmp")
}
if cfg.UserspaceMode {
args = append(args, "--tun=userspace-networking")
} else if err := ensureTunFile(cfg.Root); err != nil {
log.Fatalf("ensuring that /dev/net/tun exists: %v", err)
}
if cfg.SOCKSProxyAddr != "" {
args = append(args, "--socks5-server="+cfg.SOCKSProxyAddr)
}
if cfg.HTTPProxyAddr != "" {
args = append(args, "--outbound-http-proxy-listen="+cfg.HTTPProxyAddr)
}
if cfg.TailscaledConfigFilePath != "" {
args = append(args, "--config="+cfg.TailscaledConfigFilePath)
}
if cfg.DaemonExtraArgs != "" {
args = append(args, strings.Fields(cfg.DaemonExtraArgs)...)
}
return args
}
// tailscaleUp uses cfg to run 'tailscale up' everytime containerboot starts, or
// if TS_AUTH_ONCE is set, only the first time containerboot starts.
func tailscaleUp(ctx context.Context, cfg *settings) error {
args := []string{"--socket=" + cfg.Socket, "up"}
if cfg.AcceptDNS != nil && *cfg.AcceptDNS {
args = append(args, "--accept-dns=true")
} else {
args = append(args, "--accept-dns=false")
}
if cfg.AuthKey != "" {
args = append(args, "--authkey="+cfg.AuthKey)
}
// --advertise-routes can be passed an empty string to configure a
// device (that might have previously advertised subnet routes) to not
// advertise any routes. Respect an empty string passed by a user and
// use it to explicitly unset the routes.
if cfg.Routes != nil {
args = append(args, "--advertise-routes="+*cfg.Routes)
}
if cfg.Hostname != "" {
args = append(args, "--hostname="+cfg.Hostname)
}
if cfg.ExtraArgs != "" {
args = append(args, strings.Fields(cfg.ExtraArgs)...)
}
log.Printf("Running 'tailscale up'")
cmd := exec.CommandContext(ctx, "tailscale", args...)
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
if err := cmd.Run(); err != nil {
return fmt.Errorf("tailscale up failed: %v", err)
}
return nil
}
// tailscaleSet uses cfg to run 'tailscale set' to set any known configuration
// options that are passed in via environment variables. This is run after the
// node is in Running state and only if TS_AUTH_ONCE is set.
func tailscaleSet(ctx context.Context, cfg *settings) error {
args := []string{"--socket=" + cfg.Socket, "set"}
if cfg.AcceptDNS != nil && *cfg.AcceptDNS {
args = append(args, "--accept-dns=true")
} else {
args = append(args, "--accept-dns=false")
}
// --advertise-routes can be passed an empty string to configure a
// device (that might have previously advertised subnet routes) to not
// advertise any routes. Respect an empty string passed by a user and
// use it to explicitly unset the routes.
if cfg.Routes != nil {
args = append(args, "--advertise-routes="+*cfg.Routes)
}
if cfg.Hostname != "" {
args = append(args, "--hostname="+cfg.Hostname)
}
log.Printf("Running 'tailscale set'")
cmd := exec.CommandContext(ctx, "tailscale", args...)
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
if err := cmd.Run(); err != nil {
return fmt.Errorf("tailscale set failed: %v", err)
}
return nil
}
// ensureTunFile checks that /dev/net/tun exists, creating it if
// missing.
func ensureTunFile(root string) error {
@@ -864,344 +661,6 @@ func ensureTunFile(root string) error {
return nil
}
// ensureIPForwarding enables IPv4/IPv6 forwarding for the container.
func ensureIPForwarding(root, clusterProxyTargetIP, tailnetTargetIP, tailnetTargetFQDN string, routes *string) error {
var (
v4Forwarding, v6Forwarding bool
)
if clusterProxyTargetIP != "" {
proxyIP, err := netip.ParseAddr(clusterProxyTargetIP)
if err != nil {
return fmt.Errorf("invalid cluster destination IP: %v", err)
}
if proxyIP.Is4() {
v4Forwarding = true
} else {
v6Forwarding = true
}
}
if tailnetTargetIP != "" {
proxyIP, err := netip.ParseAddr(tailnetTargetIP)
if err != nil {
return fmt.Errorf("invalid tailnet destination IP: %v", err)
}
if proxyIP.Is4() {
v4Forwarding = true
} else {
v6Forwarding = true
}
}
// Currently we only proxy traffic to the IPv4 address of the tailnet
// target.
if tailnetTargetFQDN != "" {
v4Forwarding = true
}
if routes != nil && *routes != "" {
for _, route := range strings.Split(*routes, ",") {
cidr, err := netip.ParsePrefix(route)
if err != nil {
return fmt.Errorf("invalid subnet route: %v", err)
}
if cidr.Addr().Is4() {
v4Forwarding = true
} else {
v6Forwarding = true
}
}
}
return enableIPForwarding(v4Forwarding, v6Forwarding, root)
}
func enableIPForwarding(v4Forwarding, v6Forwarding bool, root string) error {
var paths []string
if v4Forwarding {
paths = append(paths, filepath.Join(root, "proc/sys/net/ipv4/ip_forward"))
}
if v6Forwarding {
paths = append(paths, filepath.Join(root, "proc/sys/net/ipv6/conf/all/forwarding"))
}
// In some common configurations (e.g. default docker,
// kubernetes), the container environment denies write access to
// most sysctls, including IP forwarding controls. Check the
// sysctl values before trying to change them, so that we
// gracefully do nothing if the container's already been set up
// properly by e.g. a k8s initContainer.
for _, path := range paths {
bs, err := os.ReadFile(path)
if err != nil {
return fmt.Errorf("reading %q: %w", path, err)
}
if v := strings.TrimSpace(string(bs)); v != "1" {
if err := os.WriteFile(path, []byte("1"), 0644); err != nil {
return fmt.Errorf("enabling %q: %w", path, err)
}
}
}
return nil
}
func installEgressForwardingRule(ctx context.Context, dstStr string, tsIPs []netip.Prefix, nfr linuxfw.NetfilterRunner) error {
dst, err := netip.ParseAddr(dstStr)
if err != nil {
return err
}
var local netip.Addr
for _, pfx := range tsIPs {
if !pfx.IsSingleIP() {
continue
}
if pfx.Addr().Is4() != dst.Is4() {
continue
}
local = pfx.Addr()
break
}
if !local.IsValid() {
return fmt.Errorf("no tailscale IP matching family of %s found in %v", dstStr, tsIPs)
}
if err := nfr.DNATNonTailscaleTraffic("tailscale0", dst); err != nil {
return fmt.Errorf("installing egress proxy rules: %w", err)
}
if err := nfr.AddSNATRuleForDst(local, dst); err != nil {
return fmt.Errorf("installing egress proxy rules: %w", err)
}
if err := nfr.ClampMSSToPMTU("tailscale0", dst); err != nil {
return fmt.Errorf("installing egress proxy rules: %w", err)
}
return nil
}
// installTSForwardingRuleForDestination accepts a destination address and a
// list of node's tailnet addresses, sets up rules to forward traffic for
// destination to the tailnet IP matching the destination IP family.
// Destination can be Pod IP of this node.
func installTSForwardingRuleForDestination(ctx context.Context, dstFilter string, tsIPs []netip.Prefix, nfr linuxfw.NetfilterRunner) error {
dst, err := netip.ParseAddr(dstFilter)
if err != nil {
return err
}
var local netip.Addr
for _, pfx := range tsIPs {
if !pfx.IsSingleIP() {
continue
}
if pfx.Addr().Is4() != dst.Is4() {
continue
}
local = pfx.Addr()
break
}
if !local.IsValid() {
return fmt.Errorf("no tailscale IP matching family of %s found in %v", dstFilter, tsIPs)
}
if err := nfr.AddDNATRule(dst, local); err != nil {
return fmt.Errorf("installing rule for forwarding traffic to tailnet IP: %w", err)
}
return nil
}
func installIngressForwardingRule(ctx context.Context, dstStr string, tsIPs []netip.Prefix, nfr linuxfw.NetfilterRunner) error {
dst, err := netip.ParseAddr(dstStr)
if err != nil {
return err
}
var local netip.Addr
proxyHasIPv4Address := false
for _, pfx := range tsIPs {
if !pfx.IsSingleIP() {
continue
}
if pfx.Addr().Is4() {
proxyHasIPv4Address = true
}
if pfx.Addr().Is4() != dst.Is4() {
continue
}
local = pfx.Addr()
break
}
if proxyHasIPv4Address && dst.Is6() {
log.Printf("Warning: proxy backend ClusterIP is an IPv6 address and the proxy has a IPv4 tailnet address. You might need to disable IPv4 address allocation for the proxy for forwarding to work. See https://github.com/tailscale/tailscale/issues/12156")
}
if !local.IsValid() {
return fmt.Errorf("no tailscale IP matching family of %s found in %v", dstStr, tsIPs)
}
if err := nfr.AddDNATRule(local, dst); err != nil {
return fmt.Errorf("installing ingress proxy rules: %w", err)
}
if err := nfr.ClampMSSToPMTU("tailscale0", dst); err != nil {
return fmt.Errorf("installing ingress proxy rules: %w", err)
}
return nil
}
func installIngressForwardingRuleForDNSTarget(ctx context.Context, backendAddrs []net.IP, tsIPs []netip.Prefix, nfr linuxfw.NetfilterRunner) error {
var (
tsv4 netip.Addr
tsv6 netip.Addr
v4Backends []netip.Addr
v6Backends []netip.Addr
)
for _, pfx := range tsIPs {
if pfx.IsSingleIP() && pfx.Addr().Is4() {
tsv4 = pfx.Addr()
continue
}
if pfx.IsSingleIP() && pfx.Addr().Is6() {
tsv6 = pfx.Addr()
continue
}
}
// TODO: log if more than one backend address is found and firewall is
// in nftables mode that only the first IP will be used.
for _, ip := range backendAddrs {
if ip.To4() != nil {
v4Backends = append(v4Backends, netip.AddrFrom4([4]byte(ip.To4())))
}
if ip.To16() != nil {
v6Backends = append(v6Backends, netip.AddrFrom16([16]byte(ip.To16())))
}
}
// Enable IP forwarding here as opposed to at the start of containerboot
// as the IPv4/IPv6 requirements might have changed.
// For Kubernetes operator proxies, forwarding for both IPv4 and IPv6 is
// enabled by an init container, so in practice enabling forwarding here
// is only needed if this proxy has been configured by manually setting
// TS_EXPERIMENTAL_DEST_DNS_NAME env var for a containerboot instance.
if err := enableIPForwarding(len(v4Backends) != 0, len(v6Backends) != 0, ""); err != nil {
log.Printf("[unexpected] failed to ensure IP forwarding: %v", err)
}
updateFirewall := func(dst netip.Addr, backendTargets []netip.Addr) error {
if err := nfr.DNATWithLoadBalancer(dst, backendTargets); err != nil {
return fmt.Errorf("installing DNAT rules for ingress backends %+#v: %w", backendTargets, err)
}
// The backend might advertize MSS higher than that of the
// tailscale interfaces. Clamp MSS of packets going out via
// tailscale0 interface to its MTU to prevent broken connections
// in environments where path MTU discovery is not working.
if err := nfr.ClampMSSToPMTU("tailscale0", dst); err != nil {
return fmt.Errorf("adding rule to clamp traffic via tailscale0: %v", err)
}
return nil
}
if len(v4Backends) != 0 {
if !tsv4.IsValid() {
log.Printf("backend targets %v contain at least one IPv4 address, but this node's Tailscale IPs do not contain a valid IPv4 address: %v", backendAddrs, tsIPs)
} else if err := updateFirewall(tsv4, v4Backends); err != nil {
return fmt.Errorf("Installing IPv4 firewall rules: %w", err)
}
}
if len(v6Backends) != 0 && !tsv6.IsValid() {
if !tsv6.IsValid() {
log.Printf("backend targets %v contain at least one IPv6 address, but this node's Tailscale IPs do not contain a valid IPv6 address: %v", backendAddrs, tsIPs)
} else if !nfr.HasIPV6NAT() {
log.Printf("backend targets %v contain at least one IPv6 address, but the chosen firewall mode does not support IPv6 NAT", backendAddrs)
} else if err := updateFirewall(tsv6, v6Backends); err != nil {
return fmt.Errorf("Installing IPv6 firewall rules: %w", err)
}
}
return nil
}
// settings is all the configuration for containerboot.
type settings struct {
AuthKey string
Hostname string
Routes *string
// ProxyTargetIP is the destination IP to which all incoming
// Tailscale traffic should be proxied. If empty, no proxying
// is done. This is typically a locally reachable IP.
ProxyTargetIP string
// ProxyTargetDNSName is a DNS name to whose backing IP addresses all
// incoming Tailscale traffic should be proxied.
ProxyTargetDNSName string
// TailnetTargetIP is the destination IP to which all incoming
// non-Tailscale traffic should be proxied. This is typically a
// Tailscale IP.
TailnetTargetIP string
// TailnetTargetFQDN is an MagicDNS name to which all incoming
// non-Tailscale traffic should be proxied. This must be a full Tailnet
// node FQDN.
TailnetTargetFQDN string
ServeConfigPath string
DaemonExtraArgs string
ExtraArgs string
InKubernetes bool
UserspaceMode bool
StateDir string
AcceptDNS *bool
KubeSecret string
SOCKSProxyAddr string
HTTPProxyAddr string
Socket string
AuthOnce bool
Root string
KubernetesCanPatch bool
TailscaledConfigFilePath string
EnableForwardingOptimizations bool
// If set to true and, if this containerboot instance is a Kubernetes
// ingress proxy, set up rules to forward incoming cluster traffic to be
// forwarded to the ingress target in cluster.
AllowProxyingClusterTrafficViaIngress bool
// PodIP is the IP of the Pod if running in Kubernetes. This is used
// when setting up rules to proxy cluster traffic to cluster ingress
// target.
PodIP string
}
func (s *settings) validate() error {
if s.TailscaledConfigFilePath != "" {
dir, file := path.Split(s.TailscaledConfigFilePath)
if _, err := os.Stat(dir); err != nil {
return fmt.Errorf("error validating whether directory with tailscaled config file %s exists: %w", dir, err)
}
if _, err := os.Stat(s.TailscaledConfigFilePath); err != nil {
return fmt.Errorf("error validating whether tailscaled config directory %q contains tailscaled config for current capability version %q: %w. If this is a Tailscale Kubernetes operator proxy, please ensure that the version of the operator is not older than the version of the proxy", dir, file, err)
}
if _, err := conffile.Load(s.TailscaledConfigFilePath); err != nil {
return fmt.Errorf("error validating tailscaled configfile contents: %w", err)
}
}
if s.ProxyTargetIP != "" && s.UserspaceMode {
return errors.New("TS_DEST_IP is not supported with TS_USERSPACE")
}
if s.ProxyTargetDNSName != "" && s.UserspaceMode {
return errors.New("TS_EXPERIMENTAL_DEST_DNS_NAME is not supported with TS_USERSPACE")
}
if s.ProxyTargetDNSName != "" && s.ProxyTargetIP != "" {
return errors.New("TS_EXPERIMENTAL_DEST_DNS_NAME and TS_DEST_IP cannot both be set")
}
if s.TailnetTargetIP != "" && s.UserspaceMode {
return errors.New("TS_TAILNET_TARGET_IP is not supported with TS_USERSPACE")
}
if s.TailnetTargetFQDN != "" && s.UserspaceMode {
return errors.New("TS_TAILNET_TARGET_FQDN is not supported with TS_USERSPACE")
}
if s.TailnetTargetFQDN != "" && s.TailnetTargetIP != "" {
return errors.New("Both TS_TAILNET_TARGET_IP and TS_TAILNET_FQDN cannot be set")
}
if s.TailscaledConfigFilePath != "" && (s.AcceptDNS != nil || s.AuthKey != "" || s.Routes != nil || s.ExtraArgs != "" || s.Hostname != "") {
return errors.New("TS_EXPERIMENTAL_VERSIONED_CONFIG_DIR cannot be set in combination with TS_HOSTNAME, TS_EXTRA_ARGS, TS_AUTHKEY, TS_ROUTES, TS_ACCEPT_DNS.")
}
if s.AllowProxyingClusterTrafficViaIngress && s.UserspaceMode {
return errors.New("EXPERIMENTAL_ALLOW_PROXYING_CLUSTER_TRAFFIC_VIA_INGRESS is not supported in userspace mode")
}
if s.AllowProxyingClusterTrafficViaIngress && s.ServeConfigPath == "" {
return errors.New("EXPERIMENTAL_ALLOW_PROXYING_CLUSTER_TRAFFIC_VIA_INGRESS is set but this is not a cluster ingress proxy")
}
if s.AllowProxyingClusterTrafficViaIngress && s.PodIP == "" {
return errors.New("EXPERIMENTAL_ALLOW_PROXYING_CLUSTER_TRAFFIC_VIA_INGRESS is set but POD_IP is not set")
}
if s.EnableForwardingOptimizations && s.UserspaceMode {
return errors.New("TS_EXPERIMENTAL_ENABLE_FORWARDING_OPTIMIZATIONS is not supported in userspace mode")
}
return nil
}
func resolveDNS(ctx context.Context, name string) ([]net.IP, error) {
// TODO (irbekrm): look at using recursive.Resolver instead to resolve
// the DNS names as well as retrieve TTLs. It looks though that this
@@ -1224,57 +683,6 @@ func resolveDNS(ctx context.Context, name string) ([]net.IP, error) {
return append(ip4s, ip6s...), nil
}
// defaultEnv returns the value of the given envvar name, or defVal if
// unset.
func defaultEnv(name, defVal string) string {
if v, ok := os.LookupEnv(name); ok {
return v
}
return defVal
}
// defaultEnvStringPointer returns a pointer to the given envvar value if set, else
// returns nil. This is useful in cases where we need to distinguish between a
// variable being set to empty string vs unset.
func defaultEnvStringPointer(name string) *string {
if v, ok := os.LookupEnv(name); ok {
return &v
}
return nil
}
// defaultEnvBoolPointer returns a pointer to the given envvar value if set, else
// returns nil. This is useful in cases where we need to distinguish between a
// variable being explicitly set to false vs unset.
func defaultEnvBoolPointer(name string) *bool {
v := os.Getenv(name)
ret, err := strconv.ParseBool(v)
if err != nil {
return nil
}
return &ret
}
func defaultEnvs(names []string, defVal string) string {
for _, name := range names {
if v, ok := os.LookupEnv(name); ok {
return v
}
}
return defVal
}
// defaultBool returns the boolean value of the given envvar name, or
// defVal if unset or not a bool.
func defaultBool(name string, defVal bool) bool {
v := os.Getenv(name)
ret, err := strconv.ParseBool(v)
if err != nil {
return defVal
}
return ret
}
// contextWithExitSignalWatch watches for SIGTERM/SIGINT signals. It returns a
// context that gets cancelled when a signal is received and a cancel function
// that can be called to free the resources when the watch should be stopped.
@@ -1297,43 +705,6 @@ func contextWithExitSignalWatch() (context.Context, func()) {
return ctx, f
}
// isTwoStepConfigAuthOnce returns true if the Tailscale node should be configured
// in two steps and login should only happen once.
// Step 1: run 'tailscaled'
// Step 2):
// A) if this is the first time starting this node run 'tailscale up --authkey <authkey> <config opts>'
// B) if this is not the first time starting this node run 'tailscale set <config opts>'.
func isTwoStepConfigAuthOnce(cfg *settings) bool {
return cfg.AuthOnce && cfg.TailscaledConfigFilePath == ""
}
// isTwoStepConfigAlwaysAuth returns true if the Tailscale node should be configured
// in two steps and we should log in every time it starts.
// Step 1: run 'tailscaled'
// Step 2): run 'tailscale up --authkey <authkey> <config opts>'
func isTwoStepConfigAlwaysAuth(cfg *settings) bool {
return !cfg.AuthOnce && cfg.TailscaledConfigFilePath == ""
}
// isOneStepConfig returns true if the Tailscale node should always be ran and
// configured in a single step by running 'tailscaled <config opts>'
func isOneStepConfig(cfg *settings) bool {
return cfg.TailscaledConfigFilePath != ""
}
// isL3Proxy returns true if the Tailscale node needs to be configured to act
// as an L3 proxy, proxying to an endpoint provided via one of the config env
// vars.
func isL3Proxy(cfg *settings) bool {
return cfg.ProxyTargetIP != "" || cfg.ProxyTargetDNSName != "" || cfg.TailnetTargetIP != "" || cfg.TailnetTargetFQDN != "" || cfg.AllowProxyingClusterTrafficViaIngress
}
// hasKubeStateStore returns true if the state must be stored in a Kubernetes
// Secret.
func hasKubeStateStore(cfg *settings) bool {
return cfg.InKubernetes && cfg.KubernetesCanPatch && cfg.KubeSecret != ""
}
// tailscaledConfigFilePath returns the path to the tailscaled config file that
// should be used for the current capability version. It is determined by the
// TS_EXPERIMENTAL_VERSIONED_CONFIG_DIR environment variable and looks for a
@@ -1359,7 +730,6 @@ func tailscaledConfigFilePath() string {
}
cv, err := kubeutils.CapVerFromFileName(e.Name())
if err != nil {
log.Printf("skipping file %q in tailscaled config directory %q: %v", e.Name(), dir, err)
continue
}
if cv > maxCompatVer && cv <= tailcfg.CurrentCapabilityVersion {
@@ -1367,8 +737,9 @@ func tailscaledConfigFilePath() string {
}
}
if maxCompatVer == -1 {
log.Fatalf("no tailscaled config file found in %q for current capability version %q", dir, tailcfg.CurrentCapabilityVersion)
log.Fatalf("no tailscaled config file found in %q for current capability version %d", dir, tailcfg.CurrentCapabilityVersion)
}
log.Printf("Using tailscaled config file %q for capability version %q", maxCompatVer, tailcfg.CurrentCapabilityVersion)
return path.Join(dir, kubeutils.TailscaledConfigFileNameForCap(maxCompatVer))
filePath := filepath.Join(dir, kubeutils.TailscaledConfigFileName(maxCompatVer))
log.Printf("Using tailscaled config file %q to match current capability version %d", filePath, tailcfg.CurrentCapabilityVersion)
return filePath
}

View File

@@ -52,7 +52,7 @@ func TestContainerBoot(t *testing.T) {
}
defer kube.Close()
tailscaledConf := &ipn.ConfigVAlpha{AuthKey: func(s string) *string { return &s }("foo"), Version: "alpha0"}
tailscaledConf := &ipn.ConfigVAlpha{AuthKey: ptr.To("foo"), Version: "alpha0"}
tailscaledConfBytes, err := json.Marshal(tailscaledConf)
if err != nil {
t.Fatalf("error unmarshaling tailscaled config: %v", err)
@@ -116,6 +116,9 @@ func TestContainerBoot(t *testing.T) {
// WantFiles files that should exist in the container and their
// contents.
WantFiles map[string]string
// WantFatalLog is the fatal log message we expect from containerboot.
// If set for a phase, the test will finish on that phase.
WantFatalLog string
}
runningNotify := &ipn.Notify{
State: ptr.To(ipn.Running),
@@ -349,12 +352,57 @@ func TestContainerBoot(t *testing.T) {
"/usr/bin/tailscaled --socket=/tmp/tailscaled.sock --state=mem: --statedir=/tmp",
"/usr/bin/tailscale --socket=/tmp/tailscaled.sock up --accept-dns=false --authkey=tskey-key",
},
WantFiles: map[string]string{
"proc/sys/net/ipv4/ip_forward": "1",
"proc/sys/net/ipv6/conf/all/forwarding": "0",
},
},
{
Notify: runningNotify,
},
},
},
{
Name: "egress_proxy_fqdn_ipv6_target_on_ipv4_host",
Env: map[string]string{
"TS_AUTHKEY": "tskey-key",
"TS_TAILNET_TARGET_FQDN": "ipv6-node.test.ts.net", // resolves to IPv6 address
"TS_USERSPACE": "false",
"TS_TEST_FAKE_NETFILTER_6": "false",
},
Phases: []phase{
{
WantCmds: []string{
"/usr/bin/tailscaled --socket=/tmp/tailscaled.sock --state=mem: --statedir=/tmp",
"/usr/bin/tailscale --socket=/tmp/tailscaled.sock up --accept-dns=false --authkey=tskey-key",
},
WantFiles: map[string]string{
"proc/sys/net/ipv4/ip_forward": "1",
"proc/sys/net/ipv6/conf/all/forwarding": "0",
},
},
{
Notify: &ipn.Notify{
State: ptr.To(ipn.Running),
NetMap: &netmap.NetworkMap{
SelfNode: (&tailcfg.Node{
StableID: tailcfg.StableNodeID("myID"),
Name: "test-node.test.ts.net",
Addresses: []netip.Prefix{netip.MustParsePrefix("100.64.0.1/32")},
}).View(),
Peers: []tailcfg.NodeView{
(&tailcfg.Node{
StableID: tailcfg.StableNodeID("ipv6ID"),
Name: "ipv6-node.test.ts.net",
Addresses: []netip.Prefix{netip.MustParsePrefix("::1/128")},
}).View(),
},
},
},
WantFatalLog: "no forwarding rules for egress addresses [::1/128], host supports IPv6: false",
},
},
},
{
Name: "authkey_once",
Env: map[string]string{
@@ -697,6 +745,25 @@ func TestContainerBoot(t *testing.T) {
var wantCmds []string
for i, p := range test.Phases {
lapi.Notify(p.Notify)
if p.WantFatalLog != "" {
err := tstest.WaitFor(2*time.Second, func() error {
state, err := cmd.Process.Wait()
if err != nil {
return err
}
if state.ExitCode() != 1 {
return fmt.Errorf("process exited with code %d but wanted %d", state.ExitCode(), 1)
}
waitLogLine(t, time.Second, cbOut, p.WantFatalLog)
return nil
})
if err != nil {
t.Fatal(err)
}
// Early test return, we don't expect the successful startup log message.
return
}
wantCmds = append(wantCmds, p.WantCmds...)
waitArgs(t, 2*time.Second, d, argFile, strings.Join(wantCmds, "\n"))
err := tstest.WaitFor(2*time.Second, func() error {

View File

@@ -0,0 +1,96 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build linux
package main
import (
"bytes"
"context"
"encoding/json"
"log"
"os"
"path/filepath"
"reflect"
"sync/atomic"
"time"
"github.com/fsnotify/fsnotify"
"tailscale.com/client/tailscale"
"tailscale.com/ipn"
)
// watchServeConfigChanges watches path for changes, and when it sees one, reads
// the serve config from it, replacing ${TS_CERT_DOMAIN} with certDomain, and
// applies it to lc. It exits when ctx is canceled. cdChanged is a channel that
// is written to when the certDomain changes, causing the serve config to be
// re-read and applied.
func watchServeConfigChanges(ctx context.Context, path string, cdChanged <-chan bool, certDomainAtomic *atomic.Pointer[string], lc *tailscale.LocalClient) {
if certDomainAtomic == nil {
panic("cd must not be nil")
}
var tickChan <-chan time.Time
var eventChan <-chan fsnotify.Event
if w, err := fsnotify.NewWatcher(); err != nil {
log.Printf("failed to create fsnotify watcher, timer-only mode: %v", err)
ticker := time.NewTicker(5 * time.Second)
defer ticker.Stop()
tickChan = ticker.C
} else {
defer w.Close()
if err := w.Add(filepath.Dir(path)); err != nil {
log.Fatalf("failed to add fsnotify watch: %v", err)
}
eventChan = w.Events
}
var certDomain string
var prevServeConfig *ipn.ServeConfig
for {
select {
case <-ctx.Done():
return
case <-cdChanged:
certDomain = *certDomainAtomic.Load()
case <-tickChan:
case <-eventChan:
// We can't do any reasonable filtering on the event because of how
// k8s handles these mounts. So just re-read the file and apply it
// if it's changed.
}
if certDomain == "" {
continue
}
sc, err := readServeConfig(path, certDomain)
if err != nil {
log.Fatalf("failed to read serve config: %v", err)
}
if prevServeConfig != nil && reflect.DeepEqual(sc, prevServeConfig) {
continue
}
log.Printf("Applying serve config")
if err := lc.SetServeConfig(ctx, sc); err != nil {
log.Fatalf("failed to set serve config: %v", err)
}
prevServeConfig = sc
}
}
// readServeConfig reads the ipn.ServeConfig from path, replacing
// ${TS_CERT_DOMAIN} with certDomain.
func readServeConfig(path, certDomain string) (*ipn.ServeConfig, error) {
if path == "" {
return nil, nil
}
j, err := os.ReadFile(path)
if err != nil {
return nil, err
}
j = bytes.ReplaceAll(j, []byte("${TS_CERT_DOMAIN}"), []byte(certDomain))
var sc ipn.ServeConfig
if err := json.Unmarshal(j, &sc); err != nil {
return nil, err
}
return &sc, nil
}

View File

@@ -0,0 +1,571 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build linux
package main
import (
"context"
"encoding/json"
"errors"
"fmt"
"log"
"net/netip"
"os"
"path/filepath"
"reflect"
"strings"
"time"
"github.com/fsnotify/fsnotify"
"tailscale.com/ipn"
"tailscale.com/kube/egressservices"
"tailscale.com/kube/kubeclient"
"tailscale.com/tailcfg"
"tailscale.com/util/linuxfw"
"tailscale.com/util/mak"
)
const tailscaleTunInterface = "tailscale0"
// This file contains functionality to run containerboot as a proxy that can
// route cluster traffic to one or more tailnet targets, based on portmapping
// rules read from a configfile. Currently (9/2024) this is only used for the
// Kubernetes operator egress proxies.
// egressProxy knows how to configure firewall rules to route cluster traffic to
// one or more tailnet services.
type egressProxy struct {
cfgPath string // path to egress service config file
nfr linuxfw.NetfilterRunner // never nil
kc kubeclient.Client // never nil
stateSecret string // name of the kube state Secret
netmapChan chan ipn.Notify // chan to receive netmap updates on
podIPv4 string // never empty string, currently only IPv4 is supported
// tailnetFQDNs is the egress service FQDN to tailnet IP mappings that
// were last used to configure firewall rules for this proxy.
// TODO(irbekrm): target addresses are also stored in the state Secret.
// Evaluate whether we should retrieve them from there and not store in
// memory at all.
targetFQDNs map[string][]netip.Prefix
// used to configure firewall rules.
tailnetAddrs []netip.Prefix
}
// run configures egress proxy firewall rules and ensures that the firewall rules are reconfigured when:
// - the mounted egress config has changed
// - the proxy's tailnet IP addresses have changed
// - tailnet IPs have changed for any backend targets specified by tailnet FQDN
func (ep *egressProxy) run(ctx context.Context, n ipn.Notify) error {
var tickChan <-chan time.Time
var eventChan <-chan fsnotify.Event
// TODO (irbekrm): take a look if this can be pulled into a single func
// shared with serve config loader.
if w, err := fsnotify.NewWatcher(); err != nil {
log.Printf("failed to create fsnotify watcher, timer-only mode: %v", err)
ticker := time.NewTicker(5 * time.Second)
defer ticker.Stop()
tickChan = ticker.C
} else {
defer w.Close()
if err := w.Add(filepath.Dir(ep.cfgPath)); err != nil {
return fmt.Errorf("failed to add fsnotify watch: %w", err)
}
eventChan = w.Events
}
if err := ep.sync(ctx, n); err != nil {
return err
}
for {
var err error
select {
case <-ctx.Done():
return nil
case <-tickChan:
err = ep.sync(ctx, n)
case <-eventChan:
log.Printf("config file change detected, ensuring firewall config is up to date...")
err = ep.sync(ctx, n)
case n = <-ep.netmapChan:
shouldResync := ep.shouldResync(n)
if shouldResync {
log.Printf("netmap change detected, ensuring firewall config is up to date...")
err = ep.sync(ctx, n)
}
}
if err != nil {
return fmt.Errorf("error syncing egress service config: %w", err)
}
}
}
// sync triggers an egress proxy config resync. The resync calculates the diff between config and status to determine if
// any firewall rules need to be updated. Currently using status in state Secret as a reference for what is the current
// firewall configuration is good enough because - the status is keyed by the Pod IP - we crash the Pod on errors such
// as failed firewall update
func (ep *egressProxy) sync(ctx context.Context, n ipn.Notify) error {
cfgs, err := ep.getConfigs()
if err != nil {
return fmt.Errorf("error retrieving egress service configs: %w", err)
}
status, err := ep.getStatus(ctx)
if err != nil {
return fmt.Errorf("error retrieving current egress proxy status: %w", err)
}
newStatus, err := ep.syncEgressConfigs(cfgs, status, n)
if err != nil {
return fmt.Errorf("error syncing egress service configs: %w", err)
}
if !servicesStatusIsEqual(newStatus, status) {
if err := ep.setStatus(ctx, newStatus, n); err != nil {
return fmt.Errorf("error setting egress proxy status: %w", err)
}
}
return nil
}
// addrsHaveChanged returns true if the provided netmap update contains tailnet address change for this proxy node.
// Netmap must not be nil.
func (ep *egressProxy) addrsHaveChanged(n ipn.Notify) bool {
return !reflect.DeepEqual(ep.tailnetAddrs, n.NetMap.SelfNode.Addresses())
}
// syncEgressConfigs adds and deletes firewall rules to match the desired
// configuration. It uses the provided status to determine what is currently
// applied and updates the status after a successful sync.
func (ep *egressProxy) syncEgressConfigs(cfgs *egressservices.Configs, status *egressservices.Status, n ipn.Notify) (*egressservices.Status, error) {
if !(wantsServicesConfigured(cfgs) || hasServicesConfigured(status)) {
return nil, nil
}
// Delete unnecessary services.
if err := ep.deleteUnnecessaryServices(cfgs, status); err != nil {
return nil, fmt.Errorf("error deleting services: %w", err)
}
newStatus := &egressservices.Status{}
if !wantsServicesConfigured(cfgs) {
return newStatus, nil
}
// Add new services, update rules for any that have changed.
rulesPerSvcToAdd := make(map[string][]rule, 0)
rulesPerSvcToDelete := make(map[string][]rule, 0)
for svcName, cfg := range *cfgs {
tailnetTargetIPs, err := ep.tailnetTargetIPsForSvc(cfg, n)
if err != nil {
return nil, fmt.Errorf("error determining tailnet target IPs: %w", err)
}
rulesToAdd, rulesToDelete, err := updatesForCfg(svcName, cfg, status, tailnetTargetIPs)
if err != nil {
return nil, fmt.Errorf("error validating service changes: %v", err)
}
log.Printf("syncegressservices: looking at svc %s rulesToAdd %d rulesToDelete %d", svcName, len(rulesToAdd), len(rulesToDelete))
if len(rulesToAdd) != 0 {
mak.Set(&rulesPerSvcToAdd, svcName, rulesToAdd)
}
if len(rulesToDelete) != 0 {
mak.Set(&rulesPerSvcToDelete, svcName, rulesToDelete)
}
if len(rulesToAdd) != 0 || ep.addrsHaveChanged(n) {
// For each tailnet target, set up SNAT from the local tailnet device address of the matching
// family.
for _, t := range tailnetTargetIPs {
var local netip.Addr
for _, pfx := range n.NetMap.SelfNode.Addresses().All() {
if !pfx.IsSingleIP() {
continue
}
if pfx.Addr().Is4() != t.Is4() {
continue
}
local = pfx.Addr()
break
}
if !local.IsValid() {
return nil, fmt.Errorf("no valid local IP: %v", local)
}
if err := ep.nfr.EnsureSNATForDst(local, t); err != nil {
return nil, fmt.Errorf("error setting up SNAT rule: %w", err)
}
}
}
// Update the status. Status will be written back to the state Secret by the caller.
mak.Set(&newStatus.Services, svcName, &egressservices.ServiceStatus{TailnetTargetIPs: tailnetTargetIPs, TailnetTarget: cfg.TailnetTarget, Ports: cfg.Ports})
}
// Actually apply the firewall rules.
if err := ensureRulesAdded(rulesPerSvcToAdd, ep.nfr); err != nil {
return nil, fmt.Errorf("error adding rules: %w", err)
}
if err := ensureRulesDeleted(rulesPerSvcToDelete, ep.nfr); err != nil {
return nil, fmt.Errorf("error deleting rules: %w", err)
}
return newStatus, nil
}
// updatesForCfg calculates any rules that need to be added or deleted for an individucal egress service config.
func updatesForCfg(svcName string, cfg egressservices.Config, status *egressservices.Status, tailnetTargetIPs []netip.Addr) ([]rule, []rule, error) {
rulesToAdd := make([]rule, 0)
rulesToDelete := make([]rule, 0)
currentConfig, ok := lookupCurrentConfig(svcName, status)
// If no rules for service are present yet, add them all.
if !ok {
for _, t := range tailnetTargetIPs {
for ports := range cfg.Ports {
log.Printf("syncegressservices: svc %s adding port %v", svcName, ports)
rulesToAdd = append(rulesToAdd, rule{tailnetPort: ports.TargetPort, containerPort: ports.MatchPort, protocol: ports.Protocol, tailnetIP: t})
}
}
return rulesToAdd, rulesToDelete, nil
}
// If there are no backend targets available, delete any currently configured rules.
if len(tailnetTargetIPs) == 0 {
log.Printf("tailnet target for egress service %s does not have any backend addresses, deleting all rules", svcName)
for _, ip := range currentConfig.TailnetTargetIPs {
for ports := range currentConfig.Ports {
rulesToDelete = append(rulesToAdd, rule{tailnetPort: ports.TargetPort, containerPort: ports.MatchPort, protocol: ports.Protocol, tailnetIP: ip})
}
}
return rulesToAdd, rulesToDelete, nil
}
// If there are rules present for backend targets that no longer match, delete them.
for _, ip := range currentConfig.TailnetTargetIPs {
var found bool
for _, wantsIP := range tailnetTargetIPs {
if reflect.DeepEqual(ip, wantsIP) {
found = true
break
}
}
if !found {
for ports := range currentConfig.Ports {
rulesToDelete = append(rulesToDelete, rule{tailnetPort: ports.TargetPort, containerPort: ports.MatchPort, protocol: ports.Protocol, tailnetIP: ip})
}
}
}
// Sync rules for the currently wanted backend targets.
for _, ip := range tailnetTargetIPs {
// If the backend target is not yet present in status, add all rules.
var found bool
for _, gotIP := range currentConfig.TailnetTargetIPs {
if reflect.DeepEqual(ip, gotIP) {
found = true
break
}
}
if !found {
for ports := range cfg.Ports {
rulesToAdd = append(rulesToAdd, rule{tailnetPort: ports.TargetPort, containerPort: ports.MatchPort, protocol: ports.Protocol, tailnetIP: ip})
}
continue
}
// If the backend target is present in status, check that the
// currently applied rules are up to date.
// Delete any current portmappings that are no longer present in config.
for port := range currentConfig.Ports {
if _, ok := cfg.Ports[port]; ok {
continue
}
rulesToDelete = append(rulesToDelete, rule{tailnetPort: port.TargetPort, containerPort: port.MatchPort, protocol: port.Protocol, tailnetIP: ip})
}
// Add any new portmappings.
for port := range cfg.Ports {
if _, ok := currentConfig.Ports[port]; ok {
continue
}
rulesToAdd = append(rulesToAdd, rule{tailnetPort: port.TargetPort, containerPort: port.MatchPort, protocol: port.Protocol, tailnetIP: ip})
}
}
return rulesToAdd, rulesToDelete, nil
}
// deleteUnneccessaryServices ensure that any services found on status, but not
// present in config are deleted.
func (ep *egressProxy) deleteUnnecessaryServices(cfgs *egressservices.Configs, status *egressservices.Status) error {
if !hasServicesConfigured(status) {
return nil
}
if !wantsServicesConfigured(cfgs) {
for svcName, svc := range status.Services {
log.Printf("service %s is no longer required, deleting", svcName)
if err := ensureServiceDeleted(svcName, svc, ep.nfr); err != nil {
return fmt.Errorf("error deleting service %s: %w", svcName, err)
}
}
return nil
}
for svcName, svc := range status.Services {
if _, ok := (*cfgs)[svcName]; !ok {
log.Printf("service %s is no longer required, deleting", svcName)
if err := ensureServiceDeleted(svcName, svc, ep.nfr); err != nil {
return fmt.Errorf("error deleting service %s: %w", svcName, err)
}
// TODO (irbekrm): also delete the SNAT rule here
}
}
return nil
}
// getConfigs gets the mounted egress service configuration.
func (ep *egressProxy) getConfigs() (*egressservices.Configs, error) {
j, err := os.ReadFile(ep.cfgPath)
if os.IsNotExist(err) {
return nil, nil
}
if err != nil {
return nil, err
}
if len(j) == 0 || string(j) == "" {
return nil, nil
}
cfg := &egressservices.Configs{}
if err := json.Unmarshal(j, &cfg); err != nil {
return nil, err
}
return cfg, nil
}
// getStatus gets the current status of the configured firewall. The current
// status is stored in state Secret. Returns nil status if no status that
// applies to the current proxy Pod was found. Uses the Pod IP to determine if a
// status found in the state Secret applies to this proxy Pod.
func (ep *egressProxy) getStatus(ctx context.Context) (*egressservices.Status, error) {
secret, err := ep.kc.GetSecret(ctx, ep.stateSecret)
if err != nil {
return nil, fmt.Errorf("error retrieving state secret: %w", err)
}
status := &egressservices.Status{}
raw, ok := secret.Data[egressservices.KeyEgressServices]
if !ok {
return nil, nil
}
if err := json.Unmarshal([]byte(raw), status); err != nil {
return nil, fmt.Errorf("error unmarshalling previous config: %w", err)
}
if reflect.DeepEqual(status.PodIPv4, ep.podIPv4) {
return status, nil
}
return nil, nil
}
// setStatus writes egress proxy's currently configured firewall to the state
// Secret and updates proxy's tailnet addresses.
func (ep *egressProxy) setStatus(ctx context.Context, status *egressservices.Status, n ipn.Notify) error {
// Pod IP is used to determine if a stored status applies to THIS proxy Pod.
if status == nil {
status = &egressservices.Status{}
}
status.PodIPv4 = ep.podIPv4
secret, err := ep.kc.GetSecret(ctx, ep.stateSecret)
if err != nil {
return fmt.Errorf("error retrieving state Secret: %w", err)
}
bs, err := json.Marshal(status)
if err != nil {
return fmt.Errorf("error marshalling service config: %w", err)
}
secret.Data[egressservices.KeyEgressServices] = bs
patch := kubeclient.JSONPatch{
Op: "replace",
Path: fmt.Sprintf("/data/%s", egressservices.KeyEgressServices),
Value: bs,
}
if err := ep.kc.JSONPatchSecret(ctx, ep.stateSecret, []kubeclient.JSONPatch{patch}); err != nil {
return fmt.Errorf("error patching state Secret: %w", err)
}
ep.tailnetAddrs = n.NetMap.SelfNode.Addresses().AsSlice()
return nil
}
// tailnetTargetIPsForSvc returns the tailnet IPs to which traffic for this
// egress service should be proxied. The egress service can be configured by IP
// or by FQDN. If it's configured by IP, just return that. If it's configured by
// FQDN, resolve the FQDN and return the resolved IPs. It checks if the
// netfilter runner supports IPv6 NAT and skips any IPv6 addresses if it
// doesn't.
func (ep *egressProxy) tailnetTargetIPsForSvc(svc egressservices.Config, n ipn.Notify) (addrs []netip.Addr, err error) {
if svc.TailnetTarget.IP != "" {
addr, err := netip.ParseAddr(svc.TailnetTarget.IP)
if err != nil {
return nil, fmt.Errorf("error parsing tailnet target IP: %w", err)
}
if addr.Is6() && !ep.nfr.HasIPV6NAT() {
log.Printf("tailnet target is an IPv6 address, but this host does not support IPv6 in the chosen firewall mode. This will probably not work.")
return addrs, nil
}
return []netip.Addr{addr}, nil
}
if svc.TailnetTarget.FQDN == "" {
return nil, errors.New("unexpected egress service config- neither tailnet target IP nor FQDN is set")
}
if n.NetMap == nil {
log.Printf("netmap is not available, unable to determine backend addresses for %s", svc.TailnetTarget.FQDN)
return addrs, nil
}
var (
node tailcfg.NodeView
nodeFound bool
)
for _, nn := range n.NetMap.Peers {
if equalFQDNs(nn.Name(), svc.TailnetTarget.FQDN) {
node = nn
nodeFound = true
break
}
}
if nodeFound {
for _, addr := range node.Addresses().AsSlice() {
if addr.Addr().Is6() && !ep.nfr.HasIPV6NAT() {
log.Printf("tailnet target %v is an IPv6 address, but this host does not support IPv6 in the chosen firewall mode, skipping.", addr.Addr().String())
continue
}
addrs = append(addrs, addr.Addr())
}
// Egress target endpoints configured via FQDN are stored, so
// that we can determine if a netmap update should trigger a
// resync.
mak.Set(&ep.targetFQDNs, svc.TailnetTarget.FQDN, node.Addresses().AsSlice())
}
return addrs, nil
}
// shouldResync parses netmap update and returns true if the update contains
// changes for which the egress proxy's firewall should be reconfigured.
func (ep *egressProxy) shouldResync(n ipn.Notify) bool {
if n.NetMap == nil {
return false
}
// If proxy's tailnet addresses have changed, resync.
if !reflect.DeepEqual(n.NetMap.SelfNode.Addresses().AsSlice(), ep.tailnetAddrs) {
log.Printf("node addresses have changed, trigger egress config resync")
ep.tailnetAddrs = n.NetMap.SelfNode.Addresses().AsSlice()
return true
}
// If the IPs for any of the egress services configured via FQDN have
// changed, resync.
for fqdn, ips := range ep.targetFQDNs {
for _, nn := range n.NetMap.Peers {
if equalFQDNs(nn.Name(), fqdn) {
if !reflect.DeepEqual(ips, nn.Addresses().AsSlice()) {
log.Printf("backend addresses for egress target %q have changed old IPs %v, new IPs %v trigger egress config resync", nn.Name(), ips, nn.Addresses().AsSlice())
}
return true
}
}
}
return false
}
// ensureServiceDeleted ensures that any rules for an egress service are removed
// from the firewall configuration.
func ensureServiceDeleted(svcName string, svc *egressservices.ServiceStatus, nfr linuxfw.NetfilterRunner) error {
// Note that the portmap is needed for iptables based firewall only.
// Nftables group rules for a service in a chain, so there is no need to
// specify individual portmapping based rules.
pms := make([]linuxfw.PortMap, 0)
for pm := range svc.Ports {
pms = append(pms, linuxfw.PortMap{MatchPort: pm.MatchPort, TargetPort: pm.TargetPort, Protocol: pm.Protocol})
}
if err := nfr.DeleteSvc(svcName, tailscaleTunInterface, svc.TailnetTargetIPs, pms); err != nil {
return fmt.Errorf("error deleting service %s: %w", svcName, err)
}
return nil
}
// ensureRulesAdded ensures that all portmapping rules are added to the firewall
// configuration. For any rules that already exist, calling this function is a
// no-op. In case of nftables, a service consists of one or two (one per IP
// family) chains that conain the portmapping rules for the service and the
// chains as needed when this function is called.
func ensureRulesAdded(rulesPerSvc map[string][]rule, nfr linuxfw.NetfilterRunner) error {
for svc, rules := range rulesPerSvc {
for _, rule := range rules {
log.Printf("ensureRulesAdded svc %s tailnetTarget %s container port %d tailnet port %d protocol %s", svc, rule.tailnetIP, rule.containerPort, rule.tailnetPort, rule.protocol)
if err := nfr.EnsurePortMapRuleForSvc(svc, tailscaleTunInterface, rule.tailnetIP, linuxfw.PortMap{MatchPort: rule.containerPort, TargetPort: rule.tailnetPort, Protocol: rule.protocol}); err != nil {
return fmt.Errorf("error ensuring rule: %w", err)
}
}
}
return nil
}
// ensureRulesDeleted ensures that the given rules are deleted from the firewall
// configuration. For any rules that do not exist, calling this funcion is a
// no-op.
func ensureRulesDeleted(rulesPerSvc map[string][]rule, nfr linuxfw.NetfilterRunner) error {
for svc, rules := range rulesPerSvc {
for _, rule := range rules {
log.Printf("ensureRulesDeleted svc %s tailnetTarget %s container port %d tailnet port %d protocol %s", svc, rule.tailnetIP, rule.containerPort, rule.tailnetPort, rule.protocol)
if err := nfr.DeletePortMapRuleForSvc(svc, tailscaleTunInterface, rule.tailnetIP, linuxfw.PortMap{MatchPort: rule.containerPort, TargetPort: rule.tailnetPort, Protocol: rule.protocol}); err != nil {
return fmt.Errorf("error deleting rule: %w", err)
}
}
}
return nil
}
func lookupCurrentConfig(svcName string, status *egressservices.Status) (*egressservices.ServiceStatus, bool) {
if status == nil || len(status.Services) == 0 {
return nil, false
}
c, ok := status.Services[svcName]
return c, ok
}
func equalFQDNs(s, s1 string) bool {
s, _ = strings.CutSuffix(s, ".")
s1, _ = strings.CutSuffix(s1, ".")
return strings.EqualFold(s, s1)
}
// rule contains configuration for an egress proxy firewall rule.
type rule struct {
containerPort uint16 // port to match incoming traffic
tailnetPort uint16 // tailnet service port
tailnetIP netip.Addr // tailnet service IP
protocol string
}
func wantsServicesConfigured(cfgs *egressservices.Configs) bool {
return cfgs != nil && len(*cfgs) != 0
}
func hasServicesConfigured(status *egressservices.Status) bool {
return status != nil && len(status.Services) != 0
}
func servicesStatusIsEqual(st, st1 *egressservices.Status) bool {
if st == nil && st1 == nil {
return true
}
if st == nil || st1 == nil {
return false
}
st.PodIPv4 = ""
st1.PodIPv4 = ""
return reflect.DeepEqual(*st, *st1)
}

View File

@@ -0,0 +1,175 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build linux
package main
import (
"net/netip"
"reflect"
"testing"
"tailscale.com/kube/egressservices"
)
func Test_updatesForSvc(t *testing.T) {
tailnetIPv4, tailnetIPv6 := netip.MustParseAddr("100.99.99.99"), netip.MustParseAddr("fd7a:115c:a1e0::701:b62a")
tailnetIPv4_1, tailnetIPv6_1 := netip.MustParseAddr("100.88.88.88"), netip.MustParseAddr("fd7a:115c:a1e0::4101:512f")
ports := map[egressservices.PortMap]struct{}{{Protocol: "tcp", MatchPort: 4003, TargetPort: 80}: {}}
ports1 := map[egressservices.PortMap]struct{}{{Protocol: "udp", MatchPort: 4004, TargetPort: 53}: {}}
ports2 := map[egressservices.PortMap]struct{}{{Protocol: "tcp", MatchPort: 4003, TargetPort: 80}: {},
{Protocol: "tcp", MatchPort: 4005, TargetPort: 443}: {}}
fqdnSpec := egressservices.Config{
TailnetTarget: egressservices.TailnetTarget{FQDN: "test"},
Ports: ports,
}
fqdnSpec1 := egressservices.Config{
TailnetTarget: egressservices.TailnetTarget{FQDN: "test"},
Ports: ports1,
}
fqdnSpec2 := egressservices.Config{
TailnetTarget: egressservices.TailnetTarget{IP: tailnetIPv4.String()},
Ports: ports,
}
fqdnSpec3 := egressservices.Config{
TailnetTarget: egressservices.TailnetTarget{IP: tailnetIPv4.String()},
Ports: ports2,
}
r := rule{containerPort: 4003, tailnetPort: 80, protocol: "tcp", tailnetIP: tailnetIPv4}
r1 := rule{containerPort: 4003, tailnetPort: 80, protocol: "tcp", tailnetIP: tailnetIPv6}
r2 := rule{tailnetPort: 53, containerPort: 4004, protocol: "udp", tailnetIP: tailnetIPv4}
r3 := rule{tailnetPort: 53, containerPort: 4004, protocol: "udp", tailnetIP: tailnetIPv6}
r4 := rule{containerPort: 4003, tailnetPort: 80, protocol: "tcp", tailnetIP: tailnetIPv4_1}
r5 := rule{containerPort: 4003, tailnetPort: 80, protocol: "tcp", tailnetIP: tailnetIPv6_1}
r6 := rule{containerPort: 4005, tailnetPort: 443, protocol: "tcp", tailnetIP: tailnetIPv4}
tests := []struct {
name string
svcName string
tailnetTargetIPs []netip.Addr
podIP string
spec egressservices.Config
status *egressservices.Status
wantRulesToAdd []rule
wantRulesToDelete []rule
}{
{
name: "add_fqdn_svc_that_does_not_yet_exist",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4, tailnetIPv6},
spec: fqdnSpec,
status: &egressservices.Status{},
wantRulesToAdd: []rule{r, r1},
wantRulesToDelete: []rule{},
},
{
name: "fqdn_svc_already_exists",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4, tailnetIPv6},
spec: fqdnSpec,
status: &egressservices.Status{
Services: map[string]*egressservices.ServiceStatus{"test": {
TailnetTargetIPs: []netip.Addr{tailnetIPv4, tailnetIPv6},
TailnetTarget: egressservices.TailnetTarget{FQDN: "test"},
Ports: ports,
}}},
wantRulesToAdd: []rule{},
wantRulesToDelete: []rule{},
},
{
name: "fqdn_svc_already_exists_add_port_remove_port",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4, tailnetIPv6},
spec: fqdnSpec1,
status: &egressservices.Status{
Services: map[string]*egressservices.ServiceStatus{"test": {
TailnetTargetIPs: []netip.Addr{tailnetIPv4, tailnetIPv6},
TailnetTarget: egressservices.TailnetTarget{FQDN: "test"},
Ports: ports,
}}},
wantRulesToAdd: []rule{r2, r3},
wantRulesToDelete: []rule{r, r1},
},
{
name: "fqdn_svc_already_exists_change_fqdn_backend_ips",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4_1, tailnetIPv6_1},
spec: fqdnSpec,
status: &egressservices.Status{
Services: map[string]*egressservices.ServiceStatus{"test": {
TailnetTargetIPs: []netip.Addr{tailnetIPv4, tailnetIPv6},
TailnetTarget: egressservices.TailnetTarget{FQDN: "test"},
Ports: ports,
}}},
wantRulesToAdd: []rule{r4, r5},
wantRulesToDelete: []rule{r, r1},
},
{
name: "add_ip_service",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4},
spec: fqdnSpec2,
status: &egressservices.Status{},
wantRulesToAdd: []rule{r},
wantRulesToDelete: []rule{},
},
{
name: "add_ip_service_already_exists",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4},
spec: fqdnSpec2,
status: &egressservices.Status{
Services: map[string]*egressservices.ServiceStatus{"test": {
TailnetTargetIPs: []netip.Addr{tailnetIPv4},
TailnetTarget: egressservices.TailnetTarget{IP: tailnetIPv4.String()},
Ports: ports,
}}},
wantRulesToAdd: []rule{},
wantRulesToDelete: []rule{},
},
{
name: "ip_service_add_port",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4},
spec: fqdnSpec3,
status: &egressservices.Status{
Services: map[string]*egressservices.ServiceStatus{"test": {
TailnetTargetIPs: []netip.Addr{tailnetIPv4},
TailnetTarget: egressservices.TailnetTarget{IP: tailnetIPv4.String()},
Ports: ports,
}}},
wantRulesToAdd: []rule{r6},
wantRulesToDelete: []rule{},
},
{
name: "ip_service_delete_port",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4},
spec: fqdnSpec,
status: &egressservices.Status{
Services: map[string]*egressservices.ServiceStatus{"test": {
TailnetTargetIPs: []netip.Addr{tailnetIPv4},
TailnetTarget: egressservices.TailnetTarget{IP: tailnetIPv4.String()},
Ports: ports2,
}}},
wantRulesToAdd: []rule{},
wantRulesToDelete: []rule{r6},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
gotRulesToAdd, gotRulesToDelete, err := updatesForCfg(tt.svcName, tt.spec, tt.status, tt.tailnetTargetIPs)
if err != nil {
t.Errorf("updatesForSvc() unexpected error %v", err)
return
}
if !reflect.DeepEqual(gotRulesToAdd, tt.wantRulesToAdd) {
t.Errorf("updatesForSvc() got rulesToAdd = \n%v\n want rulesToAdd \n%v", gotRulesToAdd, tt.wantRulesToAdd)
}
if !reflect.DeepEqual(gotRulesToDelete, tt.wantRulesToDelete) {
t.Errorf("updatesForSvc() got rulesToDelete = \n%v\n want rulesToDelete \n%v", gotRulesToDelete, tt.wantRulesToDelete)
}
})
}
}

View File

@@ -0,0 +1,324 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build linux
package main
import (
"context"
"errors"
"fmt"
"log"
"net/netip"
"os"
"path"
"strconv"
"strings"
"tailscale.com/ipn/conffile"
"tailscale.com/kube/kubeclient"
)
// settings is all the configuration for containerboot.
type settings struct {
AuthKey string
Hostname string
Routes *string
// ProxyTargetIP is the destination IP to which all incoming
// Tailscale traffic should be proxied. If empty, no proxying
// is done. This is typically a locally reachable IP.
ProxyTargetIP string
// ProxyTargetDNSName is a DNS name to whose backing IP addresses all
// incoming Tailscale traffic should be proxied.
ProxyTargetDNSName string
// TailnetTargetIP is the destination IP to which all incoming
// non-Tailscale traffic should be proxied. This is typically a
// Tailscale IP.
TailnetTargetIP string
// TailnetTargetFQDN is an MagicDNS name to which all incoming
// non-Tailscale traffic should be proxied. This must be a full Tailnet
// node FQDN.
TailnetTargetFQDN string
ServeConfigPath string
DaemonExtraArgs string
ExtraArgs string
InKubernetes bool
UserspaceMode bool
StateDir string
AcceptDNS *bool
KubeSecret string
SOCKSProxyAddr string
HTTPProxyAddr string
Socket string
AuthOnce bool
Root string
KubernetesCanPatch bool
TailscaledConfigFilePath string
EnableForwardingOptimizations bool
// If set to true and, if this containerboot instance is a Kubernetes
// ingress proxy, set up rules to forward incoming cluster traffic to be
// forwarded to the ingress target in cluster.
AllowProxyingClusterTrafficViaIngress bool
// PodIP is the IP of the Pod if running in Kubernetes. This is used
// when setting up rules to proxy cluster traffic to cluster ingress
// target.
// Deprecated: use PodIPv4, PodIPv6 instead to support dual stack clusters
PodIP string
PodIPv4 string
PodIPv6 string
HealthCheckAddrPort string
EgressSvcsCfgPath string
}
func configFromEnv() (*settings, error) {
cfg := &settings{
AuthKey: defaultEnvs([]string{"TS_AUTHKEY", "TS_AUTH_KEY"}, ""),
Hostname: defaultEnv("TS_HOSTNAME", ""),
Routes: defaultEnvStringPointer("TS_ROUTES"),
ServeConfigPath: defaultEnv("TS_SERVE_CONFIG", ""),
ProxyTargetIP: defaultEnv("TS_DEST_IP", ""),
ProxyTargetDNSName: defaultEnv("TS_EXPERIMENTAL_DEST_DNS_NAME", ""),
TailnetTargetIP: defaultEnv("TS_TAILNET_TARGET_IP", ""),
TailnetTargetFQDN: defaultEnv("TS_TAILNET_TARGET_FQDN", ""),
DaemonExtraArgs: defaultEnv("TS_TAILSCALED_EXTRA_ARGS", ""),
ExtraArgs: defaultEnv("TS_EXTRA_ARGS", ""),
InKubernetes: os.Getenv("KUBERNETES_SERVICE_HOST") != "",
UserspaceMode: defaultBool("TS_USERSPACE", true),
StateDir: defaultEnv("TS_STATE_DIR", ""),
AcceptDNS: defaultEnvBoolPointer("TS_ACCEPT_DNS"),
KubeSecret: defaultEnv("TS_KUBE_SECRET", "tailscale"),
SOCKSProxyAddr: defaultEnv("TS_SOCKS5_SERVER", ""),
HTTPProxyAddr: defaultEnv("TS_OUTBOUND_HTTP_PROXY_LISTEN", ""),
Socket: defaultEnv("TS_SOCKET", "/tmp/tailscaled.sock"),
AuthOnce: defaultBool("TS_AUTH_ONCE", false),
Root: defaultEnv("TS_TEST_ONLY_ROOT", "/"),
TailscaledConfigFilePath: tailscaledConfigFilePath(),
AllowProxyingClusterTrafficViaIngress: defaultBool("EXPERIMENTAL_ALLOW_PROXYING_CLUSTER_TRAFFIC_VIA_INGRESS", false),
PodIP: defaultEnv("POD_IP", ""),
EnableForwardingOptimizations: defaultBool("TS_EXPERIMENTAL_ENABLE_FORWARDING_OPTIMIZATIONS", false),
HealthCheckAddrPort: defaultEnv("TS_HEALTHCHECK_ADDR_PORT", ""),
EgressSvcsCfgPath: defaultEnv("TS_EGRESS_SERVICES_CONFIG_PATH", ""),
}
podIPs, ok := os.LookupEnv("POD_IPS")
if ok {
ips := strings.Split(podIPs, ",")
if len(ips) > 2 {
return nil, fmt.Errorf("POD_IPs can contain at most 2 IPs, got %d (%v)", len(ips), ips)
}
for _, ip := range ips {
parsed, err := netip.ParseAddr(ip)
if err != nil {
return nil, fmt.Errorf("error parsing IP address %s: %w", ip, err)
}
if parsed.Is4() {
cfg.PodIPv4 = parsed.String()
continue
}
cfg.PodIPv6 = parsed.String()
}
}
if err := cfg.validate(); err != nil {
return nil, fmt.Errorf("invalid configuration: %v", err)
}
return cfg, nil
}
func (s *settings) validate() error {
if s.TailscaledConfigFilePath != "" {
dir, file := path.Split(s.TailscaledConfigFilePath)
if _, err := os.Stat(dir); err != nil {
return fmt.Errorf("error validating whether directory with tailscaled config file %s exists: %w", dir, err)
}
if _, err := os.Stat(s.TailscaledConfigFilePath); err != nil {
return fmt.Errorf("error validating whether tailscaled config directory %q contains tailscaled config for current capability version %q: %w. If this is a Tailscale Kubernetes operator proxy, please ensure that the version of the operator is not older than the version of the proxy", dir, file, err)
}
if _, err := conffile.Load(s.TailscaledConfigFilePath); err != nil {
return fmt.Errorf("error validating tailscaled configfile contents: %w", err)
}
}
if s.ProxyTargetIP != "" && s.UserspaceMode {
return errors.New("TS_DEST_IP is not supported with TS_USERSPACE")
}
if s.ProxyTargetDNSName != "" && s.UserspaceMode {
return errors.New("TS_EXPERIMENTAL_DEST_DNS_NAME is not supported with TS_USERSPACE")
}
if s.ProxyTargetDNSName != "" && s.ProxyTargetIP != "" {
return errors.New("TS_EXPERIMENTAL_DEST_DNS_NAME and TS_DEST_IP cannot both be set")
}
if s.TailnetTargetIP != "" && s.UserspaceMode {
return errors.New("TS_TAILNET_TARGET_IP is not supported with TS_USERSPACE")
}
if s.TailnetTargetFQDN != "" && s.UserspaceMode {
return errors.New("TS_TAILNET_TARGET_FQDN is not supported with TS_USERSPACE")
}
if s.TailnetTargetFQDN != "" && s.TailnetTargetIP != "" {
return errors.New("Both TS_TAILNET_TARGET_IP and TS_TAILNET_FQDN cannot be set")
}
if s.TailscaledConfigFilePath != "" && (s.AcceptDNS != nil || s.AuthKey != "" || s.Routes != nil || s.ExtraArgs != "" || s.Hostname != "") {
return errors.New("TS_EXPERIMENTAL_VERSIONED_CONFIG_DIR cannot be set in combination with TS_HOSTNAME, TS_EXTRA_ARGS, TS_AUTHKEY, TS_ROUTES, TS_ACCEPT_DNS.")
}
if s.AllowProxyingClusterTrafficViaIngress && s.UserspaceMode {
return errors.New("EXPERIMENTAL_ALLOW_PROXYING_CLUSTER_TRAFFIC_VIA_INGRESS is not supported in userspace mode")
}
if s.AllowProxyingClusterTrafficViaIngress && s.ServeConfigPath == "" {
return errors.New("EXPERIMENTAL_ALLOW_PROXYING_CLUSTER_TRAFFIC_VIA_INGRESS is set but this is not a cluster ingress proxy")
}
if s.AllowProxyingClusterTrafficViaIngress && s.PodIP == "" {
return errors.New("EXPERIMENTAL_ALLOW_PROXYING_CLUSTER_TRAFFIC_VIA_INGRESS is set but POD_IP is not set")
}
if s.EnableForwardingOptimizations && s.UserspaceMode {
return errors.New("TS_EXPERIMENTAL_ENABLE_FORWARDING_OPTIMIZATIONS is not supported in userspace mode")
}
if s.HealthCheckAddrPort != "" {
if _, err := netip.ParseAddrPort(s.HealthCheckAddrPort); err != nil {
return fmt.Errorf("error parsing TS_HEALTH_CHECK_ADDR_PORT value %q: %w", s.HealthCheckAddrPort, err)
}
}
return nil
}
// setupKube is responsible for doing any necessary configuration and checks to
// ensure that tailscale state storage and authentication mechanism will work on
// Kubernetes.
func (cfg *settings) setupKube(ctx context.Context) error {
if cfg.KubeSecret == "" {
return nil
}
canPatch, canCreate, err := kc.CheckSecretPermissions(ctx, cfg.KubeSecret)
if err != nil {
return fmt.Errorf("some Kubernetes permissions are missing, please check your RBAC configuration: %v", err)
}
cfg.KubernetesCanPatch = canPatch
s, err := kc.GetSecret(ctx, cfg.KubeSecret)
if err != nil {
if !kubeclient.IsNotFoundErr(err) {
return fmt.Errorf("getting Tailscale state Secret %s: %v", cfg.KubeSecret, err)
}
if !canCreate {
return fmt.Errorf("tailscale state Secret %s does not exist and we don't have permissions to create it. "+
"If you intend to store tailscale state elsewhere than a Kubernetes Secret, "+
"you can explicitly set TS_KUBE_SECRET env var to an empty string. "+
"Else ensure that RBAC is set up that allows the service account associated with this installation to create Secrets.", cfg.KubeSecret)
}
}
// Return early if we already have an auth key.
if cfg.AuthKey != "" || isOneStepConfig(cfg) {
return nil
}
if s == nil {
log.Print("TS_AUTHKEY not provided and state Secret does not exist, login will be interactive if needed.")
return nil
}
keyBytes, _ := s.Data["authkey"]
key := string(keyBytes)
if key != "" {
// Enforce that we must be able to patch out the authkey after
// authenticating if you want to use this feature. This avoids
// us having to deal with the case where we might leave behind
// an unnecessary reusable authkey in a secret, like a rake in
// the grass.
if !cfg.KubernetesCanPatch {
return errors.New("authkey found in TS_KUBE_SECRET, but the pod doesn't have patch permissions on the Secret to manage the authkey.")
}
cfg.AuthKey = key
}
log.Print("No authkey found in state Secret and TS_AUTHKEY not provided, login will be interactive if needed.")
return nil
}
// isTwoStepConfigAuthOnce returns true if the Tailscale node should be configured
// in two steps and login should only happen once.
// Step 1: run 'tailscaled'
// Step 2):
// A) if this is the first time starting this node run 'tailscale up --authkey <authkey> <config opts>'
// B) if this is not the first time starting this node run 'tailscale set <config opts>'.
func isTwoStepConfigAuthOnce(cfg *settings) bool {
return cfg.AuthOnce && cfg.TailscaledConfigFilePath == ""
}
// isTwoStepConfigAlwaysAuth returns true if the Tailscale node should be configured
// in two steps and we should log in every time it starts.
// Step 1: run 'tailscaled'
// Step 2): run 'tailscale up --authkey <authkey> <config opts>'
func isTwoStepConfigAlwaysAuth(cfg *settings) bool {
return !cfg.AuthOnce && cfg.TailscaledConfigFilePath == ""
}
// isOneStepConfig returns true if the Tailscale node should always be ran and
// configured in a single step by running 'tailscaled <config opts>'
func isOneStepConfig(cfg *settings) bool {
return cfg.TailscaledConfigFilePath != ""
}
// isL3Proxy returns true if the Tailscale node needs to be configured to act
// as an L3 proxy, proxying to an endpoint provided via one of the config env
// vars.
func isL3Proxy(cfg *settings) bool {
return cfg.ProxyTargetIP != "" || cfg.ProxyTargetDNSName != "" || cfg.TailnetTargetIP != "" || cfg.TailnetTargetFQDN != "" || cfg.AllowProxyingClusterTrafficViaIngress || cfg.EgressSvcsCfgPath != ""
}
// hasKubeStateStore returns true if the state must be stored in a Kubernetes
// Secret.
func hasKubeStateStore(cfg *settings) bool {
return cfg.InKubernetes && cfg.KubernetesCanPatch && cfg.KubeSecret != ""
}
// defaultEnv returns the value of the given envvar name, or defVal if
// unset.
func defaultEnv(name, defVal string) string {
if v, ok := os.LookupEnv(name); ok {
return v
}
return defVal
}
// defaultEnvStringPointer returns a pointer to the given envvar value if set, else
// returns nil. This is useful in cases where we need to distinguish between a
// variable being set to empty string vs unset.
func defaultEnvStringPointer(name string) *string {
if v, ok := os.LookupEnv(name); ok {
return &v
}
return nil
}
// defaultEnvBoolPointer returns a pointer to the given envvar value if set, else
// returns nil. This is useful in cases where we need to distinguish between a
// variable being explicitly set to false vs unset.
func defaultEnvBoolPointer(name string) *bool {
v := os.Getenv(name)
ret, err := strconv.ParseBool(v)
if err != nil {
return nil
}
return &ret
}
func defaultEnvs(names []string, defVal string) string {
for _, name := range names {
if v, ok := os.LookupEnv(name); ok {
return v
}
}
return defVal
}
// defaultBool returns the boolean value of the given envvar name, or
// defVal if unset or not a bool.
func defaultBool(name string, defVal bool) bool {
v := os.Getenv(name)
ret, err := strconv.ParseBool(v)
if err != nil {
return defVal
}
return ret
}

View File

@@ -0,0 +1,162 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build linux
package main
import (
"context"
"errors"
"fmt"
"io/fs"
"log"
"os"
"os/exec"
"strings"
"syscall"
"time"
"tailscale.com/client/tailscale"
)
func startTailscaled(ctx context.Context, cfg *settings) (*tailscale.LocalClient, *os.Process, error) {
args := tailscaledArgs(cfg)
// tailscaled runs without context, since it needs to persist
// beyond the startup timeout in ctx.
cmd := exec.Command("tailscaled", args...)
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
cmd.SysProcAttr = &syscall.SysProcAttr{
Setpgid: true,
}
log.Printf("Starting tailscaled")
if err := cmd.Start(); err != nil {
return nil, nil, fmt.Errorf("starting tailscaled failed: %v", err)
}
// Wait for the socket file to appear, otherwise API ops will racily fail.
log.Printf("Waiting for tailscaled socket")
for {
if ctx.Err() != nil {
log.Fatalf("Timed out waiting for tailscaled socket")
}
_, err := os.Stat(cfg.Socket)
if errors.Is(err, fs.ErrNotExist) {
time.Sleep(100 * time.Millisecond)
continue
} else if err != nil {
log.Fatalf("Waiting for tailscaled socket: %v", err)
}
break
}
tsClient := &tailscale.LocalClient{
Socket: cfg.Socket,
UseSocketOnly: true,
}
return tsClient, cmd.Process, nil
}
// tailscaledArgs uses cfg to construct the argv for tailscaled.
func tailscaledArgs(cfg *settings) []string {
args := []string{"--socket=" + cfg.Socket}
switch {
case cfg.InKubernetes && cfg.KubeSecret != "":
args = append(args, "--state=kube:"+cfg.KubeSecret)
if cfg.StateDir == "" {
cfg.StateDir = "/tmp"
}
fallthrough
case cfg.StateDir != "":
args = append(args, "--statedir="+cfg.StateDir)
default:
args = append(args, "--state=mem:", "--statedir=/tmp")
}
if cfg.UserspaceMode {
args = append(args, "--tun=userspace-networking")
} else if err := ensureTunFile(cfg.Root); err != nil {
log.Fatalf("ensuring that /dev/net/tun exists: %v", err)
}
if cfg.SOCKSProxyAddr != "" {
args = append(args, "--socks5-server="+cfg.SOCKSProxyAddr)
}
if cfg.HTTPProxyAddr != "" {
args = append(args, "--outbound-http-proxy-listen="+cfg.HTTPProxyAddr)
}
if cfg.TailscaledConfigFilePath != "" {
args = append(args, "--config="+cfg.TailscaledConfigFilePath)
}
if cfg.DaemonExtraArgs != "" {
args = append(args, strings.Fields(cfg.DaemonExtraArgs)...)
}
return args
}
// tailscaleUp uses cfg to run 'tailscale up' everytime containerboot starts, or
// if TS_AUTH_ONCE is set, only the first time containerboot starts.
func tailscaleUp(ctx context.Context, cfg *settings) error {
args := []string{"--socket=" + cfg.Socket, "up"}
if cfg.AcceptDNS != nil && *cfg.AcceptDNS {
args = append(args, "--accept-dns=true")
} else {
args = append(args, "--accept-dns=false")
}
if cfg.AuthKey != "" {
args = append(args, "--authkey="+cfg.AuthKey)
}
// --advertise-routes can be passed an empty string to configure a
// device (that might have previously advertised subnet routes) to not
// advertise any routes. Respect an empty string passed by a user and
// use it to explicitly unset the routes.
if cfg.Routes != nil {
args = append(args, "--advertise-routes="+*cfg.Routes)
}
if cfg.Hostname != "" {
args = append(args, "--hostname="+cfg.Hostname)
}
if cfg.ExtraArgs != "" {
args = append(args, strings.Fields(cfg.ExtraArgs)...)
}
log.Printf("Running 'tailscale up'")
cmd := exec.CommandContext(ctx, "tailscale", args...)
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
if err := cmd.Run(); err != nil {
return fmt.Errorf("tailscale up failed: %v", err)
}
return nil
}
// tailscaleSet uses cfg to run 'tailscale set' to set any known configuration
// options that are passed in via environment variables. This is run after the
// node is in Running state and only if TS_AUTH_ONCE is set.
func tailscaleSet(ctx context.Context, cfg *settings) error {
args := []string{"--socket=" + cfg.Socket, "set"}
if cfg.AcceptDNS != nil && *cfg.AcceptDNS {
args = append(args, "--accept-dns=true")
} else {
args = append(args, "--accept-dns=false")
}
// --advertise-routes can be passed an empty string to configure a
// device (that might have previously advertised subnet routes) to not
// advertise any routes. Respect an empty string passed by a user and
// use it to explicitly unset the routes.
if cfg.Routes != nil {
args = append(args, "--advertise-routes="+*cfg.Routes)
}
if cfg.Hostname != "" {
args = append(args, "--hostname="+cfg.Hostname)
}
log.Printf("Running 'tailscale set'")
cmd := exec.CommandContext(ctx, "tailscale", args...)
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
if err := cmd.Run(); err != nil {
return fmt.Errorf("tailscale set failed: %v", err)
}
return nil
}

View File

@@ -2,7 +2,8 @@
This is the code for the [Tailscale DERP server](https://tailscale.com/kb/1232/derp-servers).
In general, you should not need to nor want to run this code. The overwhelming majority of Tailscale users (both individuals and companies) do not.
In general, you should not need to or want to run this code. The overwhelming
majority of Tailscale users (both individuals and companies) do not.
In the happy path, Tailscale establishes direct connections between peers and
data plane traffic flows directly between them, without using DERP for more than
@@ -11,7 +12,7 @@ find yourself wanting DERP for more bandwidth, the real problem is usually the
network configuration of your Tailscale node(s), making sure that Tailscale can
get direction connections via some mechanism.
But if you've decided or been advised to run your own `derper`, then read on.
If you've decided or been advised to run your own `derper`, then read on.
## Caveats
@@ -28,7 +29,10 @@ But if you've decided or been advised to run your own `derper`, then read on.
* You must build and update the `cmd/derper` binary yourself. There are no
packages. Use `go install tailscale.com/cmd/derper@latest` with the latest
version of Go.
version of Go. You should update this binary approximately as regularly as
you update Tailscale nodes. If using `--verify-clients`, the `derper` binary
and `tailscaled` binary on the machine must be built from the same git revision.
(It might work otherwise, but they're developed and only tested together.)
* The DERP protocol does a protocol switch inside TLS from HTTP to a custom
bidirectional binary protocol. It is thus incompatible with many HTTP proxies.
@@ -55,7 +59,7 @@ rely on its DNS which might be broken and dependent on DERP to get back up.
* Monitor your DERP servers with [`cmd/derpprobe`](../derpprobe/).
* If using `--verify-clients`, a `tailscaled` must be running alongside the
`derper`.
`derper`, and all clients must be visible to the derper tailscaled in the ACL.
* If using `--verify-clients`, a `tailscaled` must also be running alongside
your `derpprobe`, and `derpprobe` needs to use `--derp-map=local`.
@@ -72,3 +76,34 @@ rely on its DNS which might be broken and dependent on DERP to get back up.
* Don't rate-limit UDP STUN packets.
* Don't rate-limit outbound TCP traffic (only inbound).
## Diagnostics
This is not a complete guide on DERP diagnostics.
Running your own DERP services requires exeprtise in multi-layer network and
application diagnostics. As the DERP runs multiple protocols at multiple layers
and is not a regular HTTP(s) server you will need expertise in correlative
analysis to diagnose the most tricky problems. There is no "plain text" or
"open" mode of operation for DERP.
* The debug handler is accessible at URL path `/debug/`. It is only accessible
over localhost or from a Tailscale IP address.
* Go pprof can be accessed via the debug handler at `/debug/pprof/`
* Prometheus compatible metrics can be gathered from the debug handler at
`/debug/varz`.
* `cmd/stunc` in the Tailscale repository provides a basic tool for diagnosing
issues with STUN.
* `cmd/derpprobe` provides a service for monitoring DERP cluster health.
* `tailscale debug derp` and `tailscale netcheck` provide additional client
driven diagnostic information for DERP communications.
* Tailscale logs may provide insight for certain problems, such as if DERPs are
unreachable or peers are regularly not reachable in their DERP home regions.
There are many possible misconfiguration causes for these problems, but
regular log entries are a good first indicator that there is a problem.

View File

@@ -7,9 +7,19 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
W 💣 github.com/alexbrainman/sspi/negotiate from tailscale.com/net/tshttpproxy
github.com/beorn7/perks/quantile from github.com/prometheus/client_golang/prometheus
💣 github.com/cespare/xxhash/v2 from github.com/prometheus/client_golang/prometheus
github.com/coder/websocket from tailscale.com/cmd/derper+
github.com/coder/websocket/internal/errd from github.com/coder/websocket
github.com/coder/websocket/internal/util from github.com/coder/websocket
github.com/coder/websocket/internal/xsync from github.com/coder/websocket
L github.com/coreos/go-iptables/iptables from tailscale.com/util/linuxfw
W 💣 github.com/dblohm7/wingoes from tailscale.com/util/winutil
github.com/fxamacker/cbor/v2 from tailscale.com/tka
github.com/go-json-experiment/json from tailscale.com/types/opt+
github.com/go-json-experiment/json/internal from github.com/go-json-experiment/json+
github.com/go-json-experiment/json/internal/jsonflags from github.com/go-json-experiment/json+
github.com/go-json-experiment/json/internal/jsonopts from github.com/go-json-experiment/json+
github.com/go-json-experiment/json/internal/jsonwire from github.com/go-json-experiment/json+
github.com/go-json-experiment/json/jsontext from github.com/go-json-experiment/json+
github.com/golang/groupcache/lru from tailscale.com/net/dnscache
L github.com/google/nftables from tailscale.com/util/linuxfw
L 💣 github.com/google/nftables/alignedbuff from github.com/google/nftables/xt
@@ -42,7 +52,7 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
W github.com/tailscale/go-winio/internal/stringbuffer from github.com/tailscale/go-winio/internal/fs
W github.com/tailscale/go-winio/pkg/guid from github.com/tailscale/go-winio+
L 💣 github.com/tailscale/netlink from tailscale.com/util/linuxfw
L 💣 github.com/vishvananda/netlink/nl from github.com/tailscale/netlink
L 💣 github.com/tailscale/netlink/nl from github.com/tailscale/netlink
L github.com/vishvananda/netns from github.com/tailscale/netlink+
github.com/x448/float16 from github.com/fxamacker/cbor/v2
💣 go4.org/mem from tailscale.com/client/tailscale+
@@ -76,10 +86,6 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
google.golang.org/protobuf/runtime/protoiface from google.golang.org/protobuf/internal/impl+
google.golang.org/protobuf/runtime/protoimpl from github.com/prometheus/client_model/go+
google.golang.org/protobuf/types/known/timestamppb from github.com/prometheus/client_golang/prometheus+
nhooyr.io/websocket from tailscale.com/cmd/derper+
nhooyr.io/websocket/internal/errd from nhooyr.io/websocket
nhooyr.io/websocket/internal/util from nhooyr.io/websocket
nhooyr.io/websocket/internal/xsync from nhooyr.io/websocket
tailscale.com from tailscale.com/version
tailscale.com/atomicfile from tailscale.com/cmd/derper+
tailscale.com/client/tailscale from tailscale.com/derp
@@ -93,35 +99,37 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
tailscale.com/hostinfo from tailscale.com/net/netmon+
tailscale.com/ipn from tailscale.com/client/tailscale
tailscale.com/ipn/ipnstate from tailscale.com/client/tailscale+
tailscale.com/kube/kubetypes from tailscale.com/envknob
tailscale.com/metrics from tailscale.com/cmd/derper+
tailscale.com/net/dnscache from tailscale.com/derp/derphttp
tailscale.com/net/ktimeout from tailscale.com/cmd/derper
tailscale.com/net/netaddr from tailscale.com/ipn+
tailscale.com/net/netknob from tailscale.com/net/netns
💣 tailscale.com/net/netmon from tailscale.com/derp/derphttp+
tailscale.com/net/netns from tailscale.com/derp/derphttp
💣 tailscale.com/net/netns from tailscale.com/derp/derphttp
tailscale.com/net/netutil from tailscale.com/client/tailscale
tailscale.com/net/sockstats from tailscale.com/derp/derphttp
tailscale.com/net/stun from tailscale.com/net/stunserver
tailscale.com/net/stunserver from tailscale.com/cmd/derper
L tailscale.com/net/tcpinfo from tailscale.com/derp
tailscale.com/net/tlsdial from tailscale.com/derp/derphttp
tailscale.com/net/tlsdial/blockblame from tailscale.com/net/tlsdial
tailscale.com/net/tsaddr from tailscale.com/ipn+
💣 tailscale.com/net/tshttpproxy from tailscale.com/derp/derphttp+
tailscale.com/net/wsconn from tailscale.com/cmd/derper+
tailscale.com/net/wsconn from tailscale.com/cmd/derper
tailscale.com/paths from tailscale.com/client/tailscale
💣 tailscale.com/safesocket from tailscale.com/client/tailscale
tailscale.com/syncs from tailscale.com/cmd/derper+
tailscale.com/tailcfg from tailscale.com/client/tailscale+
tailscale.com/tka from tailscale.com/client/tailscale+
W tailscale.com/tsconst from tailscale.com/net/netmon
W tailscale.com/tsconst from tailscale.com/net/netmon+
tailscale.com/tstime from tailscale.com/derp+
tailscale.com/tstime/mono from tailscale.com/tstime/rate
tailscale.com/tstime/rate from tailscale.com/derp
tailscale.com/tsweb from tailscale.com/cmd/derper
tailscale.com/tsweb/promvarz from tailscale.com/tsweb
tailscale.com/tsweb/varz from tailscale.com/tsweb+
tailscale.com/types/dnstype from tailscale.com/tailcfg
tailscale.com/types/dnstype from tailscale.com/tailcfg+
tailscale.com/types/empty from tailscale.com/ipn
tailscale.com/types/ipproto from tailscale.com/tailcfg+
tailscale.com/types/key from tailscale.com/client/tailscale+
@@ -132,6 +140,7 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
tailscale.com/types/persist from tailscale.com/ipn
tailscale.com/types/preftype from tailscale.com/ipn
tailscale.com/types/ptr from tailscale.com/hostinfo+
tailscale.com/types/result from tailscale.com/util/lineiter
tailscale.com/types/structs from tailscale.com/ipn+
tailscale.com/types/tkatype from tailscale.com/client/tailscale+
tailscale.com/types/views from tailscale.com/ipn+
@@ -140,11 +149,13 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
tailscale.com/util/cloudenv from tailscale.com/hostinfo+
W tailscale.com/util/cmpver from tailscale.com/net/tshttpproxy
tailscale.com/util/ctxkey from tailscale.com/tsweb+
💣 tailscale.com/util/deephash from tailscale.com/util/syspolicy/setting
L 💣 tailscale.com/util/dirwalk from tailscale.com/metrics
tailscale.com/util/dnsname from tailscale.com/hostinfo+
tailscale.com/util/fastuuid from tailscale.com/tsweb
💣 tailscale.com/util/hashx from tailscale.com/util/deephash
tailscale.com/util/httpm from tailscale.com/client/tailscale
tailscale.com/util/lineread from tailscale.com/hostinfo+
tailscale.com/util/lineiter from tailscale.com/hostinfo+
L tailscale.com/util/linuxfw from tailscale.com/net/netns
tailscale.com/util/mak from tailscale.com/health+
tailscale.com/util/multierr from tailscale.com/health+
@@ -153,8 +164,17 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
tailscale.com/util/singleflight from tailscale.com/net/dnscache
tailscale.com/util/slicesx from tailscale.com/cmd/derper+
tailscale.com/util/syspolicy from tailscale.com/ipn
tailscale.com/util/syspolicy/internal from tailscale.com/util/syspolicy/setting+
tailscale.com/util/syspolicy/internal/loggerx from tailscale.com/util/syspolicy/internal/metrics+
tailscale.com/util/syspolicy/internal/metrics from tailscale.com/util/syspolicy/source
tailscale.com/util/syspolicy/rsop from tailscale.com/util/syspolicy
tailscale.com/util/syspolicy/setting from tailscale.com/util/syspolicy+
tailscale.com/util/syspolicy/source from tailscale.com/util/syspolicy+
tailscale.com/util/testenv from tailscale.com/util/syspolicy+
tailscale.com/util/usermetric from tailscale.com/health
tailscale.com/util/vizerror from tailscale.com/tailcfg+
W 💣 tailscale.com/util/winutil from tailscale.com/hostinfo+
W 💣 tailscale.com/util/winutil/gp from tailscale.com/util/syspolicy/source
W 💣 tailscale.com/util/winutil/winenv from tailscale.com/hostinfo+
tailscale.com/version from tailscale.com/derp+
tailscale.com/version/distro from tailscale.com/envknob+
@@ -165,15 +185,17 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
golang.org/x/crypto/blake2b from golang.org/x/crypto/argon2+
golang.org/x/crypto/blake2s from tailscale.com/tka
golang.org/x/crypto/chacha20 from golang.org/x/crypto/chacha20poly1305
golang.org/x/crypto/chacha20poly1305 from crypto/tls
golang.org/x/crypto/chacha20poly1305 from crypto/tls+
golang.org/x/crypto/cryptobyte from crypto/ecdsa+
golang.org/x/crypto/cryptobyte/asn1 from crypto/ecdsa+
golang.org/x/crypto/curve25519 from golang.org/x/crypto/nacl/box+
golang.org/x/crypto/hkdf from crypto/tls
golang.org/x/crypto/hkdf from crypto/tls+
golang.org/x/crypto/nacl/box from tailscale.com/types/key
golang.org/x/crypto/nacl/secretbox from golang.org/x/crypto/nacl/box
golang.org/x/crypto/salsa20/salsa from golang.org/x/crypto/nacl/box+
golang.org/x/crypto/sha3 from crypto/internal/mlkem768+
W golang.org/x/exp/constraints from tailscale.com/util/winutil
golang.org/x/exp/maps from tailscale.com/util/syspolicy/setting+
L golang.org/x/net/bpf from github.com/mdlayher/netlink+
golang.org/x/net/dns/dnsmessage from net+
golang.org/x/net/http/httpguts from net/http
@@ -234,7 +256,7 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
encoding/pem from crypto/tls+
errors from bufio+
expvar from github.com/prometheus/client_golang/prometheus+
flag from tailscale.com/cmd/derper
flag from tailscale.com/cmd/derper+
fmt from compress/flate+
go/token from google.golang.org/protobuf/internal/strs
hash from crypto+
@@ -242,9 +264,11 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
hash/fnv from google.golang.org/protobuf/internal/detrand
hash/maphash from go4.org/mem
html from net/http/pprof+
html/template from tailscale.com/cmd/derper
io from bufio+
io/fs from crypto/x509+
io/ioutil from github.com/mitchellh/go-ps+
iter from maps+
log from expvar+
log/internal from log
maps from tailscale.com/ipn+
@@ -260,14 +284,14 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
net/http from expvar+
net/http/httptrace from net/http+
net/http/internal from net/http
net/http/pprof from tailscale.com/tsweb+
net/http/pprof from tailscale.com/tsweb
net/netip from go4.org/netipx+
net/textproto from golang.org/x/net/http/httpguts+
net/url from crypto/x509+
os from crypto/rand+
os/exec from github.com/coreos/go-iptables/iptables+
os/signal from tailscale.com/cmd/derper
W os/user from tailscale.com/util/winutil
W os/user from tailscale.com/util/winutil+
path from github.com/prometheus/client_golang/prometheus/internal+
path/filepath from crypto/x509+
reflect from crypto/x509+
@@ -285,7 +309,10 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
sync/atomic from context+
syscall from crypto/rand+
text/tabwriter from runtime/pprof
text/template from html/template
text/template/parse from html/template+
time from compress/gzip+
unicode from bytes+
unicode/utf16 from crypto/x509+
unicode/utf8 from bufio+
unique from net/netip

View File

@@ -2,6 +2,12 @@
// SPDX-License-Identifier: BSD-3-Clause
// The derper binary is a simple DERP server.
//
// For more information, see:
//
// - About: https://tailscale.com/kb/1232/derp-servers
// - Protocol & Go docs: https://pkg.go.dev/tailscale.com/derp
// - Running a DERP server: https://github.com/tailscale/tailscale/tree/main/cmd/derper#derp
package main // import "tailscale.com/cmd/derper"
import (
@@ -13,6 +19,7 @@ import (
"expvar"
"flag"
"fmt"
"html/template"
"io"
"log"
"math"
@@ -22,6 +29,9 @@ import (
"os/signal"
"path/filepath"
"regexp"
"runtime"
runtimemetrics "runtime/metrics"
"strconv"
"strings"
"syscall"
"time"
@@ -203,27 +213,23 @@ func main() {
tsweb.AddBrowserHeaders(w)
w.Header().Set("Content-Type", "text/html; charset=utf-8")
w.WriteHeader(200)
io.WriteString(w, `<html><body>
<h1>DERP</h1>
<p>
This is a
<a href="https://tailscale.com/">Tailscale</a>
<a href="https://pkg.go.dev/tailscale.com/derp">DERP</a>
server.
</p>
`)
if !*runDERP {
io.WriteString(w, `<p>Status: <b>disabled</b></p>`)
}
if tsweb.AllowDebugAccess(r) {
io.WriteString(w, "<p>Debug info at <a href='/debug/'>/debug/</a>.</p>\n")
err := homePageTemplate.Execute(w, templateData{
ShowAbuseInfo: validProdHostname.MatchString(*hostname),
Disabled: !*runDERP,
AllowDebug: tsweb.AllowDebugAccess(r),
})
if err != nil {
if r.Context().Err() == nil {
log.Printf("homePageTemplate.Execute: %v", err)
}
return
}
}))
mux.Handle("/robots.txt", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
tsweb.AddBrowserHeaders(w)
io.WriteString(w, "User-agent: *\nDisallow: /\n")
}))
mux.Handle("/generate_204", http.HandlerFunc(serveNoContent))
mux.Handle("/generate_204", http.HandlerFunc(derphttp.ServeNoContent))
debug := tsweb.Debugger(mux)
debug.KV("TLS hostname", *hostname)
debug.KV("Mesh key", s.HasMeshKey())
@@ -236,6 +242,20 @@ func main() {
}
}))
debug.Handle("traffic", "Traffic check", http.HandlerFunc(s.ServeDebugTraffic))
debug.Handle("set-mutex-profile-fraction", "SetMutexProfileFraction", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
s := r.FormValue("rate")
if s == "" || r.Header.Get("Sec-Debug") != "derp" {
http.Error(w, "To set, use: curl -HSec-Debug:derp 'http://derp/debug/set-mutex-profile-fraction?rate=100'", http.StatusBadRequest)
return
}
v, err := strconv.Atoi(s)
if err != nil {
http.Error(w, "bad rate value", http.StatusBadRequest)
return
}
old := runtime.SetMutexProfileFraction(v)
fmt.Fprintf(w, "mutex changed from %v to %v\n", old, v)
}))
// Longer lived DERP connections send an application layer keepalive. Note
// if the keepalive is hit, the user timeout will take precedence over the
@@ -309,7 +329,7 @@ func main() {
if *httpPort > -1 {
go func() {
port80mux := http.NewServeMux()
port80mux.HandleFunc("/generate_204", serveNoContent)
port80mux.HandleFunc("/generate_204", derphttp.ServeNoContent)
port80mux.Handle("/", certManager.HTTPHandler(tsweb.Port80Handler{Main: mux}))
port80srv := &http.Server{
Addr: net.JoinHostPort(listenHost, fmt.Sprintf("%d", *httpPort)),
@@ -350,31 +370,6 @@ func main() {
}
}
const (
noContentChallengeHeader = "X-Tailscale-Challenge"
noContentResponseHeader = "X-Tailscale-Response"
)
// For captive portal detection
func serveNoContent(w http.ResponseWriter, r *http.Request) {
if challenge := r.Header.Get(noContentChallengeHeader); challenge != "" {
badChar := strings.IndexFunc(challenge, func(r rune) bool {
return !isChallengeChar(r)
}) != -1
if len(challenge) <= 64 && !badChar {
w.Header().Set(noContentResponseHeader, "response "+challenge)
}
}
w.WriteHeader(http.StatusNoContent)
}
func isChallengeChar(c rune) bool {
// Semi-randomly chosen as a limited set of valid characters
return ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z') ||
('0' <= c && c <= '9') ||
c == '.' || c == '-' || c == '_'
}
var validProdHostname = regexp.MustCompile(`^derp([^.]*)\.tailscale\.com\.?$`)
func prodAutocertHostPolicy(_ context.Context, host string) error {
@@ -452,3 +447,65 @@ func (l *rateLimitedListener) Accept() (net.Conn, error) {
l.numAccepts.Add(1)
return cn, nil
}
func init() {
expvar.Publish("go_sync_mutex_wait_seconds", expvar.Func(func() any {
const name = "/sync/mutex/wait/total:seconds" // Go 1.20+
var s [1]runtimemetrics.Sample
s[0].Name = name
runtimemetrics.Read(s[:])
if v := s[0].Value; v.Kind() == runtimemetrics.KindFloat64 {
return v.Float64()
}
return 0
}))
}
type templateData struct {
ShowAbuseInfo bool
Disabled bool
AllowDebug bool
}
// homePageTemplate renders the home page using [templateData].
var homePageTemplate = template.Must(template.New("home").Parse(`<html><body>
<h1>DERP</h1>
<p>
This is a <a href="https://tailscale.com/">Tailscale</a> DERP server.
</p>
<p>
It provides STUN, interactive connectivity establishment, and relaying of end-to-end encrypted traffic
for Tailscale clients.
</p>
{{if .ShowAbuseInfo }}
<p>
If you suspect abuse, please contact <a href="mailto:security@tailscale.com">security@tailscale.com</a>.
</p>
{{end}}
<p>
Documentation:
</p>
<ul>
{{if .ShowAbuseInfo }}
<li><a href="https://tailscale.com/security-policies">Tailscale Security Policies</a></li>
<li><a href="https://tailscale.com/tailscale-aup">Tailscale Acceptable Use Policies</a></li>
{{end}}
<li><a href="https://tailscale.com/kb/1232/derp-servers">About DERP</a></li>
<li><a href="https://pkg.go.dev/tailscale.com/derp">Protocol & Go docs</a></li>
<li><a href="https://github.com/tailscale/tailscale/tree/main/cmd/derper#derp">How to run a DERP server</a></li>
</ul>
{{if .Disabled}}
<p>Status: <b>disabled</b></p>
{{end}}
{{if .AllowDebug}}
<p>Debug info at <a href='/debug/'>/debug/</a>.</p>
{{end}}
</body>
</html>
`))

View File

@@ -4,12 +4,15 @@
package main
import (
"bytes"
"context"
"fmt"
"net/http"
"net/http/httptest"
"strings"
"testing"
"tailscale.com/derp/derphttp"
"tailscale.com/tstest/deptest"
)
@@ -76,20 +79,20 @@ func TestNoContent(t *testing.T) {
t.Run(tt.name, func(t *testing.T) {
req, _ := http.NewRequest("GET", "https://localhost/generate_204", nil)
if tt.input != "" {
req.Header.Set(noContentChallengeHeader, tt.input)
req.Header.Set(derphttp.NoContentChallengeHeader, tt.input)
}
w := httptest.NewRecorder()
serveNoContent(w, req)
derphttp.ServeNoContent(w, req)
resp := w.Result()
if tt.want == "" {
if h, found := resp.Header[noContentResponseHeader]; found {
if h, found := resp.Header[derphttp.NoContentResponseHeader]; found {
t.Errorf("got %+v; expected no response header", h)
}
return
}
if got := resp.Header.Get(noContentResponseHeader); got != tt.want {
if got := resp.Header.Get(derphttp.NoContentResponseHeader); got != tt.want {
t.Errorf("got %q; want %q", got, tt.want)
}
})
@@ -109,3 +112,30 @@ func TestDeps(t *testing.T) {
},
}.Check(t)
}
func TestTemplate(t *testing.T) {
buf := &bytes.Buffer{}
err := homePageTemplate.Execute(buf, templateData{
ShowAbuseInfo: true,
Disabled: true,
AllowDebug: true,
})
if err != nil {
t.Fatal(err)
}
str := buf.String()
if !strings.Contains(str, "If you suspect abuse") {
t.Error("Output is missing abuse mailto")
}
if !strings.Contains(str, "Tailscale Security Policies") {
t.Error("Output is missing Tailscale Security Policies link")
}
if !strings.Contains(str, "Status:") {
t.Error("Output is missing disabled status")
}
if !strings.Contains(str, "Debug info") {
t.Error("Output is missing debug info")
}
fmt.Println(buf.String())
}

View File

@@ -9,14 +9,12 @@ import (
"fmt"
"log"
"net"
"net/netip"
"strings"
"time"
"tailscale.com/derp"
"tailscale.com/derp/derphttp"
"tailscale.com/net/netmon"
"tailscale.com/types/key"
"tailscale.com/types/logger"
)
@@ -71,8 +69,8 @@ func startMeshWithHost(s *derp.Server, host string) error {
return d.DialContext(ctx, network, addr)
})
add := func(k key.NodePublic, _ netip.AddrPort) { s.AddPacketForwarder(k, c) }
remove := func(k key.NodePublic) { s.RemovePacketForwarder(k, c) }
add := func(m derp.PeerPresentMessage) { s.AddPacketForwarder(m.Key, c) }
remove := func(m derp.PeerGoneMessage) { s.RemovePacketForwarder(m.Peer, c) }
go c.RunWatchConnectionLoop(context.Background(), s.PublicKey(), logf, add, remove)
return nil
}

View File

@@ -10,7 +10,7 @@ import (
"net/http"
"strings"
"nhooyr.io/websocket"
"github.com/coder/websocket"
"tailscale.com/derp"
"tailscale.com/net/wsconn"
)

View File

@@ -7,8 +7,6 @@ package main
import (
"flag"
"fmt"
"html"
"io"
"log"
"net/http"
"sort"
@@ -70,8 +68,18 @@ func main() {
}
mux := http.NewServeMux()
tsweb.Debugger(mux)
mux.HandleFunc("/", http.HandlerFunc(serveFunc(p)))
d := tsweb.Debugger(mux)
d.Handle("probe-run", "Run a probe", tsweb.StdHandler(tsweb.ReturnHandlerFunc(p.RunHandler), tsweb.HandlerOptions{Logf: log.Printf}))
mux.Handle("/", tsweb.StdHandler(p.StatusHandler(
prober.WithTitle("DERP Prober"),
prober.WithPageLink("Prober metrics", "/debug/varz"),
prober.WithProbeLink("Run Probe", "/debug/probe-run?name={{.Name}}"),
), tsweb.HandlerOptions{Logf: log.Printf}))
mux.Handle("/healthz", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "text/plain")
w.WriteHeader(http.StatusOK)
w.Write([]byte("ok\n"))
}))
log.Printf("Listening on %s", *listen)
log.Fatal(http.ListenAndServe(*listen, mux))
}
@@ -105,26 +113,3 @@ func getOverallStatus(p *prober.Prober) (o overallStatus) {
sort.Strings(o.good)
return
}
func serveFunc(p *prober.Prober) func(w http.ResponseWriter, r *http.Request) {
return func(w http.ResponseWriter, r *http.Request) {
st := getOverallStatus(p)
summary := "All good"
if (float64(len(st.bad)) / float64(len(st.bad)+len(st.good))) > 0.25 {
// Returning a 500 allows monitoring this server externally and configuring
// an alert on HTTP response code.
w.WriteHeader(500)
summary = fmt.Sprintf("%d problems", len(st.bad))
}
io.WriteString(w, "<html><head><style>.bad { font-weight: bold; color: #700; }</style></head>\n")
fmt.Fprintf(w, "<body><h1>derp probe</h1>\n%s:<ul>", summary)
for _, s := range st.bad {
fmt.Fprintf(w, "<li class=bad>%s</li>\n", html.EscapeString(s))
}
for _, s := range st.good {
fmt.Fprintf(w, "<li>%s</li>\n", html.EscapeString(s))
}
io.WriteString(w, "</ul></body></html>\n")
}
}

View File

@@ -51,6 +51,7 @@ func main() {
ctx := context.Background()
tsClient := tailscale.NewClient("-", nil)
tsClient.UserAgent = "tailscale-get-authkey"
tsClient.HTTPClient = credentials.Client(ctx)
tsClient.BaseURL = baseURL

View File

@@ -28,19 +28,20 @@ import (
)
var (
rootFlagSet = flag.NewFlagSet("gitops-pusher", flag.ExitOnError)
policyFname = rootFlagSet.String("policy-file", "./policy.hujson", "filename for policy file")
cacheFname = rootFlagSet.String("cache-file", "./version-cache.json", "filename for the previous known version hash")
timeout = rootFlagSet.Duration("timeout", 5*time.Minute, "timeout for the entire CI run")
githubSyntax = rootFlagSet.Bool("github-syntax", true, "use GitHub Action error syntax (https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions#setting-an-error-message)")
apiServer = rootFlagSet.String("api-server", "api.tailscale.com", "API server to contact")
rootFlagSet = flag.NewFlagSet("gitops-pusher", flag.ExitOnError)
policyFname = rootFlagSet.String("policy-file", "./policy.hujson", "filename for policy file")
cacheFname = rootFlagSet.String("cache-file", "./version-cache.json", "filename for the previous known version hash")
timeout = rootFlagSet.Duration("timeout", 5*time.Minute, "timeout for the entire CI run")
githubSyntax = rootFlagSet.Bool("github-syntax", true, "use GitHub Action error syntax (https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions#setting-an-error-message)")
apiServer = rootFlagSet.String("api-server", "api.tailscale.com", "API server to contact")
failOnManualEdits = rootFlagSet.Bool("fail-on-manual-edits", false, "fail if manual edits to the ACLs in the admin panel are detected; when set to false (the default) only a warning is printed")
)
func modifiedExternallyError() {
func modifiedExternallyError() error {
if *githubSyntax {
fmt.Printf("::warning file=%s,line=1,col=1,title=Policy File Modified Externally::The policy file was modified externally in the admin console.\n", *policyFname)
return fmt.Errorf("::warning file=%s,line=1,col=1,title=Policy File Modified Externally::The policy file was modified externally in the admin console.", *policyFname)
} else {
fmt.Printf("The policy file was modified externally in the admin console.\n")
return fmt.Errorf("The policy file was modified externally in the admin console.")
}
}
@@ -65,16 +66,22 @@ func apply(cache *Cache, client *http.Client, tailnet, apiKey string) func(conte
log.Printf("local: %s", localEtag)
log.Printf("cache: %s", cache.PrevETag)
if cache.PrevETag != controlEtag {
modifiedExternallyError()
}
if controlEtag == localEtag {
cache.PrevETag = localEtag
log.Println("no update needed, doing nothing")
return nil
}
if cache.PrevETag != controlEtag {
if err := modifiedExternallyError(); err != nil {
if *failOnManualEdits {
return err
} else {
fmt.Println(err)
}
}
}
if err := applyNewACL(ctx, client, tailnet, apiKey, *policyFname, controlEtag); err != nil {
return err
}
@@ -106,15 +113,21 @@ func test(cache *Cache, client *http.Client, tailnet, apiKey string) func(contex
log.Printf("local: %s", localEtag)
log.Printf("cache: %s", cache.PrevETag)
if cache.PrevETag != controlEtag {
modifiedExternallyError()
}
if controlEtag == localEtag {
log.Println("no updates found, doing nothing")
return nil
}
if cache.PrevETag != controlEtag {
if err := modifiedExternallyError(); err != nil {
if *failOnManualEdits {
return err
} else {
fmt.Println(err)
}
}
}
if err := testNewACLs(ctx, client, tailnet, apiKey, *policyFname); err != nil {
return err
}

View File

@@ -13,7 +13,8 @@ import (
"sync"
"time"
"github.com/pkg/errors"
"errors"
"go.uber.org/zap"
xslices "golang.org/x/exp/slices"
corev1 "k8s.io/api/core/v1"
@@ -26,6 +27,7 @@ import (
"sigs.k8s.io/controller-runtime/pkg/reconcile"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/kubetypes"
"tailscale.com/tstime"
"tailscale.com/util/clientmetric"
"tailscale.com/util/set"
@@ -57,15 +59,18 @@ type ConnectorReconciler struct {
subnetRouters set.Slice[types.UID] // for subnet routers gauge
exitNodes set.Slice[types.UID] // for exit nodes gauge
appConnectors set.Slice[types.UID] // for app connectors gauge
}
var (
// gaugeConnectorResources tracks the overall number of Connectors currently managed by this operator instance.
gaugeConnectorResources = clientmetric.NewGauge("k8s_connector_resources")
gaugeConnectorResources = clientmetric.NewGauge(kubetypes.MetricConnectorResourceCount)
// gaugeConnectorSubnetRouterResources tracks the number of Connectors managed by this operator instance that are subnet routers.
gaugeConnectorSubnetRouterResources = clientmetric.NewGauge("k8s_connector_subnetrouter_resources")
gaugeConnectorSubnetRouterResources = clientmetric.NewGauge(kubetypes.MetricConnectorWithSubnetRouterCount)
// gaugeConnectorExitNodeResources tracks the number of Connectors currently managed by this operator instance that are exit nodes.
gaugeConnectorExitNodeResources = clientmetric.NewGauge("k8s_connector_exitnode_resources")
gaugeConnectorExitNodeResources = clientmetric.NewGauge(kubetypes.MetricConnectorWithExitNodeCount)
// gaugeConnectorAppConnectorResources tracks the number of Connectors currently managed by this operator instance that are app connectors.
gaugeConnectorAppConnectorResources = clientmetric.NewGauge(kubetypes.MetricConnectorWithAppConnectorCount)
)
func (a *ConnectorReconciler) Reconcile(ctx context.Context, req reconcile.Request) (res reconcile.Result, err error) {
@@ -107,13 +112,12 @@ func (a *ConnectorReconciler) Reconcile(ctx context.Context, req reconcile.Reque
oldCnStatus := cn.Status.DeepCopy()
setStatus := func(cn *tsapi.Connector, _ tsapi.ConditionType, status metav1.ConditionStatus, reason, message string) (reconcile.Result, error) {
tsoperator.SetConnectorCondition(cn, tsapi.ConnectorReady, status, reason, message, cn.Generation, a.clock, logger)
var updateErr error
if !apiequality.Semantic.DeepEqual(oldCnStatus, cn.Status) {
// An error encountered here should get returned by the Reconcile function.
if updateErr := a.Client.Status().Update(ctx, cn); updateErr != nil {
err = errors.Wrap(err, updateErr.Error())
}
updateErr = a.Client.Status().Update(ctx, cn)
}
return res, err
return res, errors.Join(err, updateErr)
}
if !slices.Contains(cn.Finalizers, FinalizerName) {
@@ -149,6 +153,9 @@ func (a *ConnectorReconciler) Reconcile(ctx context.Context, req reconcile.Reque
cn.Status.SubnetRoutes = cn.Spec.SubnetRouter.AdvertiseRoutes.Stringify()
return setStatus(cn, tsapi.ConnectorReady, metav1.ConditionTrue, reasonConnectorCreated, reasonConnectorCreated)
}
if cn.Spec.AppConnector != nil {
cn.Status.IsAppConnector = true
}
cn.Status.SubnetRoutes = ""
return setStatus(cn, tsapi.ConnectorReady, metav1.ConditionTrue, reasonConnectorCreated, reasonConnectorCreated)
}
@@ -188,23 +195,37 @@ func (a *ConnectorReconciler) maybeProvisionConnector(ctx context.Context, logge
sts.Connector.routes = cn.Spec.SubnetRouter.AdvertiseRoutes.Stringify()
}
if cn.Spec.AppConnector != nil {
sts.Connector.isAppConnector = true
if len(cn.Spec.AppConnector.Routes) != 0 {
sts.Connector.routes = cn.Spec.AppConnector.Routes.Stringify()
}
}
a.mu.Lock()
if sts.Connector.isExitNode {
if cn.Spec.ExitNode {
a.exitNodes.Add(cn.UID)
} else {
a.exitNodes.Remove(cn.UID)
}
if sts.Connector.routes != "" {
if cn.Spec.SubnetRouter != nil {
a.subnetRouters.Add(cn.GetUID())
} else {
a.subnetRouters.Remove(cn.GetUID())
}
if cn.Spec.AppConnector != nil {
a.appConnectors.Add(cn.GetUID())
} else {
a.appConnectors.Remove(cn.GetUID())
}
a.mu.Unlock()
gaugeConnectorSubnetRouterResources.Set(int64(a.subnetRouters.Len()))
gaugeConnectorExitNodeResources.Set(int64(a.exitNodes.Len()))
gaugeConnectorAppConnectorResources.Set(int64(a.appConnectors.Len()))
var connectors set.Slice[types.UID]
connectors.AddSlice(a.exitNodes.Slice())
connectors.AddSlice(a.subnetRouters.Slice())
connectors.AddSlice(a.appConnectors.Slice())
gaugeConnectorResources.Set(int64(connectors.Len()))
_, err := a.ssr.Provision(ctx, logger, sts)
@@ -247,12 +268,15 @@ func (a *ConnectorReconciler) maybeCleanupConnector(ctx context.Context, logger
a.mu.Lock()
a.subnetRouters.Remove(cn.UID)
a.exitNodes.Remove(cn.UID)
a.appConnectors.Remove(cn.UID)
a.mu.Unlock()
gaugeConnectorExitNodeResources.Set(int64(a.exitNodes.Len()))
gaugeConnectorSubnetRouterResources.Set(int64(a.subnetRouters.Len()))
gaugeConnectorAppConnectorResources.Set(int64(a.appConnectors.Len()))
var connectors set.Slice[types.UID]
connectors.AddSlice(a.exitNodes.Slice())
connectors.AddSlice(a.subnetRouters.Slice())
connectors.AddSlice(a.appConnectors.Slice())
gaugeConnectorResources.Set(int64(connectors.Len()))
return true, nil
}
@@ -261,8 +285,14 @@ func (a *ConnectorReconciler) validate(cn *tsapi.Connector) error {
// Connector fields are already validated at apply time with CEL validation
// on custom resource fields. The checks here are a backup in case the
// CEL validation breaks without us noticing.
if !(cn.Spec.SubnetRouter != nil || cn.Spec.ExitNode) {
return errors.New("invalid spec: a Connector must expose subnet routes or act as an exit node (or both)")
if cn.Spec.SubnetRouter == nil && !cn.Spec.ExitNode && cn.Spec.AppConnector == nil {
return errors.New("invalid spec: a Connector must be configured as at least one of subnet router, exit node or app connector")
}
if (cn.Spec.SubnetRouter != nil || cn.Spec.ExitNode) && cn.Spec.AppConnector != nil {
return errors.New("invalid spec: a Connector that is configured as an app connector must not be also configured as a subnet router or exit node")
}
if cn.Spec.AppConnector != nil {
return validateAppConnector(cn.Spec.AppConnector)
}
if cn.Spec.SubnetRouter == nil {
return nil
@@ -271,19 +301,27 @@ func (a *ConnectorReconciler) validate(cn *tsapi.Connector) error {
}
func validateSubnetRouter(sb *tsapi.SubnetRouter) error {
if len(sb.AdvertiseRoutes) < 1 {
if len(sb.AdvertiseRoutes) == 0 {
return errors.New("invalid subnet router spec: no routes defined")
}
var err error
for _, route := range sb.AdvertiseRoutes {
return validateRoutes(sb.AdvertiseRoutes)
}
func validateAppConnector(ac *tsapi.AppConnector) error {
return validateRoutes(ac.Routes)
}
func validateRoutes(routes tsapi.Routes) error {
var errs []error
for _, route := range routes {
pfx, e := netip.ParsePrefix(string(route))
if e != nil {
err = errors.Wrap(err, fmt.Sprintf("route %s is invalid: %v", route, err))
errs = append(errs, fmt.Errorf("route %v is invalid: %v", route, e))
continue
}
if pfx.Masked() != pfx {
err = errors.Wrap(err, fmt.Sprintf("route %s has non-address bits set; expected %s", pfx, pfx.Masked()))
errs = append(errs, fmt.Errorf("route %s has non-address bits set; expected %s", pfx, pfx.Masked()))
}
}
return err
return errors.Join(errs...)
}

View File

@@ -8,14 +8,17 @@ package main
import (
"context"
"testing"
"time"
"go.uber.org/zap"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"k8s.io/client-go/tools/record"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/kubetypes"
"tailscale.com/tstest"
"tailscale.com/util/mak"
)
@@ -74,6 +77,7 @@ func TestConnector(t *testing.T) {
hostname: "test-connector",
isExitNode: true,
subnetRoutes: "10.40.0.0/14",
app: kubetypes.AppConnector,
}
expectEqual(t, fc, expectedSecret(t, fc, opts), nil)
expectEqual(t, fc, expectedSTS(t, fc, opts), removeHashAnnotation)
@@ -169,6 +173,7 @@ func TestConnector(t *testing.T) {
parentType: "connector",
subnetRoutes: "10.40.0.0/14",
hostname: "test-connector",
app: kubetypes.AppConnector,
}
expectEqual(t, fc, expectedSecret(t, fc, opts), nil)
expectEqual(t, fc, expectedSTS(t, fc, opts), removeHashAnnotation)
@@ -254,6 +259,7 @@ func TestConnectorWithProxyClass(t *testing.T) {
hostname: "test-connector",
isExitNode: true,
subnetRoutes: "10.40.0.0/14",
app: kubetypes.AppConnector,
}
expectEqual(t, fc, expectedSecret(t, fc, opts), nil)
expectEqual(t, fc, expectedSTS(t, fc, opts), removeHashAnnotation)
@@ -274,7 +280,7 @@ func TestConnectorWithProxyClass(t *testing.T) {
pc.Status = tsapi.ProxyClassStatus{
Conditions: []metav1.Condition{{
Status: metav1.ConditionTrue,
Type: string(tsapi.ProxyClassready),
Type: string(tsapi.ProxyClassReady),
ObservedGeneration: pc.Generation,
}}}
})
@@ -292,3 +298,100 @@ func TestConnectorWithProxyClass(t *testing.T) {
expectReconciled(t, cr, "", "test")
expectEqual(t, fc, expectedSTS(t, fc, opts), removeHashAnnotation)
}
func TestConnectorWithAppConnector(t *testing.T) {
// Setup
cn := &tsapi.Connector{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
UID: types.UID("1234-UID"),
},
TypeMeta: metav1.TypeMeta{
Kind: tsapi.ConnectorKind,
APIVersion: "tailscale.io/v1alpha1",
},
Spec: tsapi.ConnectorSpec{
AppConnector: &tsapi.AppConnector{},
},
}
fc := fake.NewClientBuilder().
WithScheme(tsapi.GlobalScheme).
WithObjects(cn).
WithStatusSubresource(cn).
Build()
ft := &fakeTSClient{}
zl, err := zap.NewDevelopment()
if err != nil {
t.Fatal(err)
}
cl := tstest.NewClock(tstest.ClockOpts{})
fr := record.NewFakeRecorder(1)
cr := &ConnectorReconciler{
Client: fc,
clock: cl,
ssr: &tailscaleSTSReconciler{
Client: fc,
tsClient: ft,
defaultTags: []string{"tag:k8s"},
operatorNamespace: "operator-ns",
proxyImage: "tailscale/tailscale",
},
logger: zl.Sugar(),
recorder: fr,
}
// 1. Connector with app connnector is created and becomes ready
expectReconciled(t, cr, "", "test")
fullName, shortName := findGenName(t, fc, "", "test", "connector")
opts := configOpts{
stsName: shortName,
secretName: fullName,
parentType: "connector",
hostname: "test-connector",
app: kubetypes.AppConnector,
isAppConnector: true,
}
expectEqual(t, fc, expectedSecret(t, fc, opts), nil)
expectEqual(t, fc, expectedSTS(t, fc, opts), removeHashAnnotation)
// Connector's ready condition should be set to true
cn.ObjectMeta.Finalizers = append(cn.ObjectMeta.Finalizers, "tailscale.com/finalizer")
cn.Status.IsAppConnector = true
cn.Status.Conditions = []metav1.Condition{{
Type: string(tsapi.ConnectorReady),
Status: metav1.ConditionTrue,
LastTransitionTime: metav1.Time{Time: cl.Now().Truncate(time.Second)},
Reason: reasonConnectorCreated,
Message: reasonConnectorCreated,
}}
expectEqual(t, fc, cn, nil)
// 2. Connector with invalid app connector routes has status set to invalid
mustUpdate[tsapi.Connector](t, fc, "", "test", func(conn *tsapi.Connector) {
conn.Spec.AppConnector.Routes = tsapi.Routes{tsapi.Route("1.2.3.4/5")}
})
cn.Spec.AppConnector.Routes = tsapi.Routes{tsapi.Route("1.2.3.4/5")}
expectReconciled(t, cr, "", "test")
cn.Status.Conditions = []metav1.Condition{{
Type: string(tsapi.ConnectorReady),
Status: metav1.ConditionFalse,
LastTransitionTime: metav1.Time{Time: cl.Now().Truncate(time.Second)},
Reason: reasonConnectorInvalid,
Message: "Connector is invalid: route 1.2.3.4/5 has non-address bits set; expected 0.0.0.0/5",
}}
expectEqual(t, fc, cn, nil)
// 3. Connector with valid app connnector routes becomes ready
mustUpdate[tsapi.Connector](t, fc, "", "test", func(conn *tsapi.Connector) {
conn.Spec.AppConnector.Routes = tsapi.Routes{tsapi.Route("10.88.2.21/32")}
})
cn.Spec.AppConnector.Routes = tsapi.Routes{tsapi.Route("10.88.2.21/32")}
cn.Status.Conditions = []metav1.Condition{{
Type: string(tsapi.ConnectorReady),
Status: metav1.ConditionTrue,
LastTransitionTime: metav1.Time{Time: cl.Now().Truncate(time.Second)},
Reason: reasonConnectorCreated,
Message: reasonConnectorCreated,
}}
expectReconciled(t, cr, "", "test")
}

File diff suppressed because it is too large Load Diff

View File

@@ -77,6 +77,13 @@ spec:
value: "{{ .Values.apiServerProxyConfig.mode }}"
- name: PROXY_FIREWALL_MODE
value: {{ .Values.proxyConfig.firewallMode }}
{{- if .Values.proxyConfig.defaultProxyClass }}
- name: PROXY_DEFAULT_CLASS
value: {{ .Values.proxyConfig.defaultProxyClass }}
{{- end }}
{{- with .Values.operatorConfig.extraEnv }}
{{- toYaml . | nindent 12 }}
{{- end }}
volumeMounts:
- name: oauth
mountPath: /oauth

View File

@@ -14,19 +14,22 @@ metadata:
rules:
- apiGroups: [""]
resources: ["events", "services", "services/status"]
verbs: ["*"]
verbs: ["create","delete","deletecollection","get","list","patch","update","watch"]
- apiGroups: ["networking.k8s.io"]
resources: ["ingresses", "ingresses/status"]
verbs: ["*"]
verbs: ["create","delete","deletecollection","get","list","patch","update","watch"]
- apiGroups: ["networking.k8s.io"]
resources: ["ingressclasses"]
verbs: ["get", "list", "watch"]
- apiGroups: ["tailscale.com"]
resources: ["connectors", "connectors/status", "proxyclasses", "proxyclasses/status"]
resources: ["connectors", "connectors/status", "proxyclasses", "proxyclasses/status", "proxygroups", "proxygroups/status"]
verbs: ["get", "list", "watch", "update"]
- apiGroups: ["tailscale.com"]
resources: ["dnsconfigs", "dnsconfigs/status"]
verbs: ["get", "list", "watch", "update"]
- apiGroups: ["tailscale.com"]
resources: ["recorders", "recorders/status"]
verbs: ["get", "list", "watch", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
@@ -49,13 +52,19 @@ metadata:
rules:
- apiGroups: [""]
resources: ["secrets", "serviceaccounts", "configmaps"]
verbs: ["*"]
verbs: ["create","delete","deletecollection","get","list","patch","update","watch"]
- apiGroups: [""]
resources: ["pods"]
verbs: ["get","list","watch"]
- apiGroups: ["apps"]
resources: ["statefulsets", "deployments"]
verbs: ["*"]
verbs: ["create","delete","deletecollection","get","list","patch","update","watch"]
- apiGroups: ["discovery.k8s.io"]
resources: ["endpointslices"]
verbs: ["get", "list", "watch"]
verbs: ["get", "list", "watch", "create", "update", "deletecollection"]
- apiGroups: ["rbac.authorization.k8s.io"]
resources: ["roles", "rolebindings"]
verbs: ["get", "create", "patch", "update", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding

View File

@@ -15,7 +15,7 @@ metadata:
rules:
- apiGroups: [""]
resources: ["secrets"]
verbs: ["*"]
verbs: ["create","delete","deletecollection","get","list","patch","update","watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding

View File

@@ -48,14 +48,21 @@ operatorConfig:
securityContext: {}
extraEnv: []
# - name: EXTRA_VAR1
# value: "value1"
# - name: EXTRA_VAR2
# value: "value2"
# proxyConfig contains configuraton that will be applied to any ingress/egress
# proxies created by the operator.
# https://tailscale.com/kb/1236/kubernetes-operator/#cluster-ingress
# https://tailscale.com/kb/1236/kubernetes-operator/#cluster-egress
# https://tailscale.com/kb/1439/kubernetes-operator-cluster-ingress
# https://tailscale.com/kb/1438/kubernetes-operator-cluster-egress
# Note that this section contains only a few global configuration options and
# will not be updated with more configuration options in the future.
# If you need more configuration options, take a look at ProxyClass:
# https://tailscale.com/kb/1236/kubernetes-operator#cluster-resource-customization-using-proxyclass-custom-resource
# https://tailscale.com/kb/1445/kubernetes-operator-customization#cluster-resource-customization-using-proxyclass-custom-resource
proxyConfig:
image:
# Repository defaults to DockerHub, but images are also synced to ghcr.io/tailscale/tailscale.
@@ -71,10 +78,14 @@ proxyConfig:
# Note that if you pass multiple tags to this field via `--set` flag to helm upgrade/install commands you must escape the comma (for example, "tag:k8s-proxies\,tag:prod"). See https://github.com/helm/helm/issues/1556
defaultTags: "tag:k8s"
firewallMode: auto
# If defined, this proxy class will be used as the default proxy class for
# service and ingress resources that do not have a proxy class defined. It
# does not apply to Connector resources.
defaultProxyClass: ""
# apiServerProxyConfig allows to configure whether the operator should expose
# Kubernetes API server.
# https://tailscale.com/kb/1236/kubernetes-operator/#accessing-the-kubernetes-control-plane-using-an-api-server-proxy
# https://tailscale.com/kb/1437/kubernetes-operator-api-server-proxy
apiServerProxyConfig:
mode: "false" # "true", "false", "noauth"

View File

@@ -24,6 +24,10 @@ spec:
jsonPath: .status.isExitNode
name: IsExitNode
type: string
- description: Whether this Connector instance is an app connector.
jsonPath: .status.isAppConnector
name: IsAppConnector
type: string
- description: Status of the deployed Connector resources.
jsonPath: .status.conditions[?(@.type == "ConnectorReady")].reason
name: Status
@@ -37,7 +41,7 @@ spec:
exit node.
Connector is a cluster-scoped resource.
More info:
https://tailscale.com/kb/1236/kubernetes-operator#deploying-exit-nodes-and-subnet-routers-on-kubernetes-using-connector-custom-resource
https://tailscale.com/kb/1441/kubernetes-operator-connector
type: object
required:
- spec
@@ -66,10 +70,40 @@ spec:
https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status
type: object
properties:
appConnector:
description: |-
AppConnector defines whether the Connector device should act as a Tailscale app connector. A Connector that is
configured as an app connector cannot be a subnet router or an exit node. If this field is unset, the
Connector does not act as an app connector.
Note that you will need to manually configure the permissions and the domains for the app connector via the
Admin panel.
Note also that the main tested and supported use case of this config option is to deploy an app connector on
Kubernetes to access SaaS applications available on the public internet. Using the app connector to expose
cluster workloads or other internal workloads to tailnet might work, but this is not a use case that we have
tested or optimised for.
If you are using the app connector to access SaaS applications because you need a predictable egress IP that
can be whitelisted, it is also your responsibility to ensure that cluster traffic from the connector flows
via that predictable IP, for example by enforcing that cluster egress traffic is routed via an egress NAT
device with a static IP address.
https://tailscale.com/kb/1281/app-connectors
type: object
properties:
routes:
description: |-
Routes are optional preconfigured routes for the domains routed via the app connector.
If not set, routes for the domains will be discovered dynamically.
If set, the app connector will immediately be able to route traffic using the preconfigured routes, but may
also dynamically discover other routes.
https://tailscale.com/kb/1332/apps-best-practices#preconfiguration
type: array
minItems: 1
items:
type: string
format: cidr
exitNode:
description: |-
ExitNode defines whether the Connector node should act as a
Tailscale exit node. Defaults to false.
ExitNode defines whether the Connector device should act as a Tailscale exit node. Defaults to false.
This field is mutually exclusive with the appConnector field.
https://tailscale.com/kb/1103/exit-nodes
type: boolean
hostname:
@@ -90,9 +124,11 @@ spec:
type: string
subnetRouter:
description: |-
SubnetRouter defines subnet routes that the Connector node should
expose to tailnet. If unset, none are exposed.
SubnetRouter defines subnet routes that the Connector device should
expose to tailnet as a Tailscale subnet router.
https://tailscale.com/kb/1019/subnets/
If this field is unset, the device does not get configured as a Tailscale subnet router.
This field is mutually exclusive with the appConnector field.
type: object
required:
- advertiseRoutes
@@ -115,7 +151,7 @@ spec:
To autoapprove the subnet routes or exit node defined by a Connector,
you can configure Tailscale ACLs to give these tags the necessary
permissions.
See https://tailscale.com/kb/1018/acls/#auto-approvers-for-routes-and-exit-nodes.
See https://tailscale.com/kb/1337/acl-syntax#autoapprovers.
If you specify custom tags here, you must also make the operator an owner of these tags.
See https://tailscale.com/kb/1236/kubernetes-operator/#setting-up-the-kubernetes-operator.
Tags cannot be changed once a Connector node has been created.
@@ -125,8 +161,10 @@ spec:
type: string
pattern: ^tag:[a-zA-Z][a-zA-Z0-9-]*$
x-kubernetes-validations:
- rule: has(self.subnetRouter) || self.exitNode == true
message: A Connector needs to be either an exit node or a subnet router, or both.
- rule: has(self.subnetRouter) || (has(self.exitNode) && self.exitNode == true) || has(self.appConnector)
message: A Connector needs to have at least one of exit node, subnet router or app connector configured.
- rule: '!((has(self.subnetRouter) || (has(self.exitNode) && self.exitNode == true)) && has(self.appConnector))'
message: The appConnector field is mutually exclusive with exitNode and subnetRouter fields.
status:
description: |-
ConnectorStatus describes the status of the Connector. This is set
@@ -200,6 +238,9 @@ spec:
If MagicDNS is enabled in your tailnet, it is the MagicDNS name of the
node.
type: string
isAppConnector:
description: IsAppConnector is set to true if the Connector acts as an app connector.
type: boolean
isExitNode:
description: IsExitNode is set to true if the Connector acts as an exit node.
type: boolean

View File

@@ -89,14 +89,14 @@ spec:
type: object
properties:
image:
description: Nameserver image.
description: Nameserver image. Defaults to tailscale/k8s-nameserver:unstable.
type: object
properties:
repo:
description: Repo defaults to tailscale/k8s-nameserver.
type: string
tag:
description: Tag defaults to operator's own tag.
description: Tag defaults to unstable.
type: string
status:
description: |-

View File

@@ -30,7 +30,7 @@ spec:
connector.spec.proxyClass field.
ProxyClass is a cluster scoped resource.
More info:
https://tailscale.com/kb/1236/kubernetes-operator#cluster-resource-customization-using-proxyclass-custom-resource.
https://tailscale.com/kb/1445/kubernetes-operator-customization#cluster-resource-customization-using-proxyclass-custom-resource
type: object
required:
- spec
@@ -1896,6 +1896,182 @@ spec:
Value is the taint value the toleration matches to.
If the operator is Exists, the value should be empty, otherwise just a regular string.
type: string
topologySpreadConstraints:
description: |-
Proxy Pod's topology spread constraints.
By default Tailscale Kubernetes operator does not apply any topology spread constraints.
https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/
type: array
items:
description: TopologySpreadConstraint specifies how to spread matching pods among the given topology.
type: object
required:
- maxSkew
- topologyKey
- whenUnsatisfiable
properties:
labelSelector:
description: |-
LabelSelector is used to find matching pods.
Pods that match this label selector are counted to determine the number of pods
in their corresponding topology domain.
type: object
properties:
matchExpressions:
description: matchExpressions is a list of label selector requirements. The requirements are ANDed.
type: array
items:
description: |-
A label selector requirement is a selector that contains values, a key, and an operator that
relates the key and values.
type: object
required:
- key
- operator
properties:
key:
description: key is the label key that the selector applies to.
type: string
operator:
description: |-
operator represents a key's relationship to a set of values.
Valid operators are In, NotIn, Exists and DoesNotExist.
type: string
values:
description: |-
values is an array of string values. If the operator is In or NotIn,
the values array must be non-empty. If the operator is Exists or DoesNotExist,
the values array must be empty. This array is replaced during a strategic
merge patch.
type: array
items:
type: string
x-kubernetes-list-type: atomic
x-kubernetes-list-type: atomic
matchLabels:
description: |-
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels
map is equivalent to an element of matchExpressions, whose key field is "key", the
operator is "In", and the values array contains only "value". The requirements are ANDed.
type: object
additionalProperties:
type: string
x-kubernetes-map-type: atomic
matchLabelKeys:
description: |-
MatchLabelKeys is a set of pod label keys to select the pods over which
spreading will be calculated. The keys are used to lookup values from the
incoming pod labels, those key-value labels are ANDed with labelSelector
to select the group of existing pods over which spreading will be calculated
for the incoming pod. The same key is forbidden to exist in both MatchLabelKeys and LabelSelector.
MatchLabelKeys cannot be set when LabelSelector isn't set.
Keys that don't exist in the incoming pod labels will
be ignored. A null or empty list means only match against labelSelector.
This is a beta field and requires the MatchLabelKeysInPodTopologySpread feature gate to be enabled (enabled by default).
type: array
items:
type: string
x-kubernetes-list-type: atomic
maxSkew:
description: |-
MaxSkew describes the degree to which pods may be unevenly distributed.
When `whenUnsatisfiable=DoNotSchedule`, it is the maximum permitted difference
between the number of matching pods in the target topology and the global minimum.
The global minimum is the minimum number of matching pods in an eligible domain
or zero if the number of eligible domains is less than MinDomains.
For example, in a 3-zone cluster, MaxSkew is set to 1, and pods with the same
labelSelector spread as 2/2/1:
In this case, the global minimum is 1.
| zone1 | zone2 | zone3 |
| P P | P P | P |
- if MaxSkew is 1, incoming pod can only be scheduled to zone3 to become 2/2/2;
scheduling it onto zone1(zone2) would make the ActualSkew(3-1) on zone1(zone2)
violate MaxSkew(1).
- if MaxSkew is 2, incoming pod can be scheduled onto any zone.
When `whenUnsatisfiable=ScheduleAnyway`, it is used to give higher precedence
to topologies that satisfy it.
It's a required field. Default value is 1 and 0 is not allowed.
type: integer
format: int32
minDomains:
description: |-
MinDomains indicates a minimum number of eligible domains.
When the number of eligible domains with matching topology keys is less than minDomains,
Pod Topology Spread treats "global minimum" as 0, and then the calculation of Skew is performed.
And when the number of eligible domains with matching topology keys equals or greater than minDomains,
this value has no effect on scheduling.
As a result, when the number of eligible domains is less than minDomains,
scheduler won't schedule more than maxSkew Pods to those domains.
If value is nil, the constraint behaves as if MinDomains is equal to 1.
Valid values are integers greater than 0.
When value is not nil, WhenUnsatisfiable must be DoNotSchedule.
For example, in a 3-zone cluster, MaxSkew is set to 2, MinDomains is set to 5 and pods with the same
labelSelector spread as 2/2/2:
| zone1 | zone2 | zone3 |
| P P | P P | P P |
The number of domains is less than 5(MinDomains), so "global minimum" is treated as 0.
In this situation, new pod with the same labelSelector cannot be scheduled,
because computed skew will be 3(3 - 0) if new Pod is scheduled to any of the three zones,
it will violate MaxSkew.
type: integer
format: int32
nodeAffinityPolicy:
description: |-
NodeAffinityPolicy indicates how we will treat Pod's nodeAffinity/nodeSelector
when calculating pod topology spread skew. Options are:
- Honor: only nodes matching nodeAffinity/nodeSelector are included in the calculations.
- Ignore: nodeAffinity/nodeSelector are ignored. All nodes are included in the calculations.
If this value is nil, the behavior is equivalent to the Honor policy.
This is a beta-level feature default enabled by the NodeInclusionPolicyInPodTopologySpread feature flag.
type: string
nodeTaintsPolicy:
description: |-
NodeTaintsPolicy indicates how we will treat node taints when calculating
pod topology spread skew. Options are:
- Honor: nodes without taints, along with tainted nodes for which the incoming pod
has a toleration, are included.
- Ignore: node taints are ignored. All nodes are included.
If this value is nil, the behavior is equivalent to the Ignore policy.
This is a beta-level feature default enabled by the NodeInclusionPolicyInPodTopologySpread feature flag.
type: string
topologyKey:
description: |-
TopologyKey is the key of node labels. Nodes that have a label with this key
and identical values are considered to be in the same topology.
We consider each <key, value> as a "bucket", and try to put balanced number
of pods into each bucket.
We define a domain as a particular instance of a topology.
Also, we define an eligible domain as a domain whose nodes meet the requirements of
nodeAffinityPolicy and nodeTaintsPolicy.
e.g. If TopologyKey is "kubernetes.io/hostname", each Node is a domain of that topology.
And, if TopologyKey is "topology.kubernetes.io/zone", each zone is a domain of that topology.
It's a required field.
type: string
whenUnsatisfiable:
description: |-
WhenUnsatisfiable indicates how to deal with a pod if it doesn't satisfy
the spread constraint.
- DoNotSchedule (default) tells the scheduler not to schedule it.
- ScheduleAnyway tells the scheduler to schedule the pod in any location,
but giving higher precedence to topologies that would help reduce the
skew.
A constraint is considered "Unsatisfiable" for an incoming pod
if and only if every possible node assignment for that pod would violate
"MaxSkew" on some topology.
For example, in a 3-zone cluster, MaxSkew is set to 1, and pods with the same
labelSelector spread as 3/1/1:
| zone1 | zone2 | zone3 |
| P P P | P | P |
If WhenUnsatisfiable is set to DoNotSchedule, incoming pod can only be scheduled
to zone2(zone3) to become 3/2/1(3/1/2) as ActualSkew(2-1) on zone2(zone3) satisfies
MaxSkew(1). In other words, the cluster can still be imbalanced, but scheduler
won't make it *more* imbalanced.
It's a required field.
type: string
tailscale:
description: |-
TailscaleConfig contains options to configure the tailscale-specific
@@ -1908,7 +2084,7 @@ spec:
routes advertized by other nodes on the tailnet, such as subnet
routes.
This is equivalent of passing --accept-routes flag to a tailscale Linux client.
https://tailscale.com/kb/1019/subnets#use-your-subnet-routes-from-other-machines
https://tailscale.com/kb/1019/subnets#use-your-subnet-routes-from-other-devices
Defaults to false.
type: boolean
status:

View File

@@ -0,0 +1,187 @@
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.15.1-0.20240618033008-7824932b0cab
name: proxygroups.tailscale.com
spec:
group: tailscale.com
names:
kind: ProxyGroup
listKind: ProxyGroupList
plural: proxygroups
shortNames:
- pg
singular: proxygroup
scope: Cluster
versions:
- additionalPrinterColumns:
- description: Status of the deployed ProxyGroup resources.
jsonPath: .status.conditions[?(@.type == "ProxyGroupReady")].reason
name: Status
type: string
name: v1alpha1
schema:
openAPIV3Schema:
type: object
required:
- spec
properties:
apiVersion:
description: |-
APIVersion defines the versioned schema of this representation of an object.
Servers should convert recognized schemas to the latest internal value, and
may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
type: string
kind:
description: |-
Kind is a string value representing the REST resource this object represents.
Servers may infer this from the endpoint the client submits requests to.
Cannot be updated.
In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
type: string
metadata:
type: object
spec:
description: Spec describes the desired ProxyGroup instances.
type: object
required:
- type
properties:
hostnamePrefix:
description: |-
HostnamePrefix is the hostname prefix to use for tailnet devices created
by the ProxyGroup. Each device will have the integer number from its
StatefulSet pod appended to this prefix to form the full hostname.
HostnamePrefix can contain lower case letters, numbers and dashes, it
must not start with a dash and must be between 1 and 62 characters long.
type: string
pattern: ^[a-z0-9][a-z0-9-]{0,61}$
proxyClass:
description: |-
ProxyClass is the name of the ProxyClass custom resource that contains
configuration options that should be applied to the resources created
for this ProxyGroup. If unset, and there is no default ProxyClass
configured, the operator will create resources with the default
configuration.
type: string
replicas:
description: |-
Replicas specifies how many replicas to create the StatefulSet with.
Defaults to 2.
type: integer
format: int32
tags:
description: |-
Tags that the Tailscale devices will be tagged with. Defaults to [tag:k8s].
If you specify custom tags here, make sure you also make the operator
an owner of these tags.
See https://tailscale.com/kb/1236/kubernetes-operator/#setting-up-the-kubernetes-operator.
Tags cannot be changed once a ProxyGroup device has been created.
Tag values must be in form ^tag:[a-zA-Z][a-zA-Z0-9-]*$.
type: array
items:
type: string
pattern: ^tag:[a-zA-Z][a-zA-Z0-9-]*$
type:
description: Type of the ProxyGroup proxies. Currently the only supported type is egress.
type: string
enum:
- egress
status:
description: |-
ProxyGroupStatus describes the status of the ProxyGroup resources. This is
set and managed by the Tailscale operator.
type: object
properties:
conditions:
description: |-
List of status conditions to indicate the status of the ProxyGroup
resources. Known condition types are `ProxyGroupReady`.
type: array
items:
description: Condition contains details for one aspect of the current state of this API Resource.
type: object
required:
- lastTransitionTime
- message
- reason
- status
- type
properties:
lastTransitionTime:
description: |-
lastTransitionTime is the last time the condition transitioned from one status to another.
This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
type: string
format: date-time
message:
description: |-
message is a human readable message indicating details about the transition.
This may be an empty string.
type: string
maxLength: 32768
observedGeneration:
description: |-
observedGeneration represents the .metadata.generation that the condition was set based upon.
For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
with respect to the current state of the instance.
type: integer
format: int64
minimum: 0
reason:
description: |-
reason contains a programmatic identifier indicating the reason for the condition's last transition.
Producers of specific condition types may define expected values and meanings for this field,
and whether the values are considered a guaranteed API.
The value should be a CamelCase string.
This field may not be empty.
type: string
maxLength: 1024
minLength: 1
pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
status:
description: status of the condition, one of True, False, Unknown.
type: string
enum:
- "True"
- "False"
- Unknown
type:
description: type of condition in CamelCase or in foo.example.com/CamelCase.
type: string
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
x-kubernetes-list-map-keys:
- type
x-kubernetes-list-type: map
devices:
description: List of tailnet devices associated with the ProxyGroup StatefulSet.
type: array
items:
type: object
required:
- hostname
properties:
hostname:
description: |-
Hostname is the fully qualified domain name of the device.
If MagicDNS is enabled in your tailnet, it is the MagicDNS name of the
node.
type: string
tailnetIPs:
description: |-
TailnetIPs is the set of tailnet IP addresses (both IPv4 and IPv6)
assigned to the device.
type: array
items:
type: string
x-kubernetes-list-map-keys:
- hostname
x-kubernetes-list-type: map
served: true
storage: true
subresources:
status: {}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,7 @@
apiVersion: tailscale.com/v1alpha1
kind: ProxyGroup
metadata:
name: egress-proxies
spec:
type: egress
replicas: 3

View File

@@ -0,0 +1,6 @@
apiVersion: tailscale.com/v1alpha1
kind: Recorder
metadata:
name: recorder
spec:
enableUI: true

File diff suppressed because it is too large Load Diff

View File

@@ -3,9 +3,6 @@
//go:build !plan9
// tailscale-operator provides a way to expose services running in a Kubernetes
// cluster to your Tailnet and to make Tailscale nodes available to cluster
// workloads
package main
import (
@@ -27,6 +24,7 @@ import (
operatorutils "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/util/mak"
"tailscale.com/util/set"
)
const (
@@ -170,36 +168,49 @@ func (dnsRR *dnsRecordsReconciler) maybeProvision(ctx context.Context, headlessS
}
}
// Get the Pod IP addresses for the proxy from the EndpointSlice for the
// headless Service.
// Get the Pod IP addresses for the proxy from the EndpointSlices for
// the headless Service. The Service can have multiple EndpointSlices
// associated with it, for example in dual-stack clusters.
labels := map[string]string{discoveryv1.LabelServiceName: headlessSvc.Name} // https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#ownership
eps, err := getSingleObject[discoveryv1.EndpointSlice](ctx, dnsRR.Client, dnsRR.tsNamespace, labels)
if err != nil {
return fmt.Errorf("error getting the EndpointSlice for the proxy's headless Service: %w", err)
var eps = new(discoveryv1.EndpointSliceList)
if err := dnsRR.List(ctx, eps, client.InNamespace(dnsRR.tsNamespace), client.MatchingLabels(labels)); err != nil {
return fmt.Errorf("error listing EndpointSlices for the proxy's headless Service: %w", err)
}
if eps == nil {
if len(eps.Items) == 0 {
logger.Debugf("proxy's headless Service EndpointSlice does not yet exist. We will reconcile again once it's created")
return nil
}
// An EndpointSlice for a Service can have a list of endpoints that each
// Each EndpointSlice for a Service can have a list of endpoints that each
// can have multiple addresses - these are the IP addresses of any Pods
// selected by that Service. Pick all the IPv4 addresses.
ips := make([]string, 0)
for _, ep := range eps.Endpoints {
for _, ip := range ep.Addresses {
if !net.IsIPv4String(ip) {
logger.Infof("EndpointSlice contains IP address %q that is not IPv4, ignoring. Currently only IPv4 is supported", ip)
} else {
ips = append(ips, ip)
// It is also possible that multiple EndpointSlices have overlapping addresses.
// https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#duplicate-endpoints
ips := make(set.Set[string], 0)
for _, slice := range eps.Items {
if slice.AddressType != discoveryv1.AddressTypeIPv4 {
logger.Infof("EndpointSlice is for AddressType %s, currently only IPv4 address type is supported", slice.AddressType)
continue
}
for _, ep := range slice.Endpoints {
if !epIsReady(&ep) {
logger.Debugf("Endpoint with addresses %v appears not ready to receive traffic %v", ep.Addresses, ep.Conditions.String())
continue
}
for _, ip := range ep.Addresses {
if !net.IsIPv4String(ip) {
logger.Infof("EndpointSlice contains IP address %q that is not IPv4, ignoring. Currently only IPv4 is supported", ip)
} else {
ips.Add(ip)
}
}
}
}
if len(ips) == 0 {
if ips.Len() == 0 {
logger.Debugf("EndpointSlice for the Service contains no IPv4 addresses. We will reconcile again once they are created.")
return nil
}
updateFunc := func(rec *operatorutils.Records) {
mak.Set(&rec.IP4, fqdn, ips)
mak.Set(&rec.IP4, fqdn, ips.Slice())
}
if err = dnsRR.updateDNSConfig(ctx, updateFunc); err != nil {
return fmt.Errorf("error updating DNS records: %w", err)
@@ -207,6 +218,17 @@ func (dnsRR *dnsRecordsReconciler) maybeProvision(ctx context.Context, headlessS
return nil
}
// epIsReady reports whether the endpoint is currently in a state to receive new
// traffic. As per kube docs, only explicitly set 'false' for 'Ready' or
// 'Serving' conditions or explicitly set 'true' for 'Terminating' condition
// means that the Endpoint is NOT ready.
// https://github.com/kubernetes/kubernetes/blob/60c4c2b2521fb454ce69dee737e3eb91a25e0535/pkg/apis/discovery/types.go#L109-L131
func epIsReady(ep *discoveryv1.Endpoint) bool {
return (ep.Conditions.Ready == nil || *ep.Conditions.Ready) &&
(ep.Conditions.Serving == nil || *ep.Conditions.Serving) &&
(ep.Conditions.Terminating == nil || !*ep.Conditions.Terminating)
}
// maybeCleanup ensures that the DNS record for the proxy has been removed from
// dnsrecords ConfigMap and the tailscale.com/dns-records-reconciler finalizer
// has been removed from the Service. If the record is not found in the

View File

@@ -8,6 +8,7 @@ package main
import (
"context"
"encoding/json"
"fmt"
"testing"
"github.com/google/go-cmp/cmp"
@@ -87,13 +88,16 @@ func TestDNSRecordsReconciler(t *testing.T) {
},
}
headlessForEgressSvcFQDN := headlessSvcForParent(egressSvcFQDN, "svc") // create the proxy headless Service
ep := endpointSliceForService(headlessForEgressSvcFQDN, "10.9.8.7")
ep := endpointSliceForService(headlessForEgressSvcFQDN, "10.9.8.7", discoveryv1.AddressTypeIPv4)
epv6 := endpointSliceForService(headlessForEgressSvcFQDN, "2600:1900:4011:161:0:d:0:d", discoveryv1.AddressTypeIPv6)
mustCreate(t, fc, egressSvcFQDN)
mustCreate(t, fc, headlessForEgressSvcFQDN)
mustCreate(t, fc, ep)
mustCreate(t, fc, epv6)
expectReconciled(t, dnsRR, "tailscale", "egress-fqdn") // dns-records-reconciler reconcile the headless Service
// ConfigMap should now have a record for foo.bar.ts.net -> 10.8.8.7
wantHosts := map[string][]string{"foo.bar.ts.net": {"10.9.8.7"}}
wantHosts := map[string][]string{"foo.bar.ts.net": {"10.9.8.7"}} // IPv6 endpoint is currently ignored
expectHostsRecords(t, fc, wantHosts)
// 2. DNS record is updated if tailscale.com/tailnet-fqdn annotation's
@@ -106,7 +110,7 @@ func TestDNSRecordsReconciler(t *testing.T) {
expectHostsRecords(t, fc, wantHosts)
// 3. DNS record is updated if the IP address of the proxy Pod changes.
ep = endpointSliceForService(headlessForEgressSvcFQDN, "10.6.5.4")
ep = endpointSliceForService(headlessForEgressSvcFQDN, "10.6.5.4", discoveryv1.AddressTypeIPv4)
mustUpdate(t, fc, ep.Namespace, ep.Name, func(ep *discoveryv1.EndpointSlice) {
ep.Endpoints[0].Addresses = []string{"10.6.5.4"}
})
@@ -116,7 +120,7 @@ func TestDNSRecordsReconciler(t *testing.T) {
// 4. DNS record is created for an ingress proxy configured via Ingress
headlessForIngress := headlessSvcForParent(ing, "ingress")
ep = endpointSliceForService(headlessForIngress, "10.9.8.7")
ep = endpointSliceForService(headlessForIngress, "10.9.8.7", discoveryv1.AddressTypeIPv4)
mustCreate(t, fc, headlessForIngress)
mustCreate(t, fc, ep)
expectReconciled(t, dnsRR, "tailscale", "ts-ingress") // dns-records-reconciler should reconcile the headless Service
@@ -140,6 +144,17 @@ func TestDNSRecordsReconciler(t *testing.T) {
expectReconciled(t, dnsRR, "tailscale", "ts-ingress")
wantHosts["another.ingress.ts.net"] = []string{"7.8.9.10"}
expectHostsRecords(t, fc, wantHosts)
// 7. A not-ready Endpoint is removed from DNS config.
mustUpdate(t, fc, ep.Namespace, ep.Name, func(ep *discoveryv1.EndpointSlice) {
ep.Endpoints[0].Conditions.Ready = ptr.To(false)
ep.Endpoints = append(ep.Endpoints, discoveryv1.Endpoint{
Addresses: []string{"1.2.3.4"},
})
})
expectReconciled(t, dnsRR, "tailscale", "ts-ingress")
wantHosts["another.ingress.ts.net"] = []string{"1.2.3.4"}
expectHostsRecords(t, fc, wantHosts)
}
func headlessSvcForParent(o client.Object, typ string) *corev1.Service {
@@ -162,15 +177,21 @@ func headlessSvcForParent(o client.Object, typ string) *corev1.Service {
}
}
func endpointSliceForService(svc *corev1.Service, ip string) *discoveryv1.EndpointSlice {
func endpointSliceForService(svc *corev1.Service, ip string, fam discoveryv1.AddressType) *discoveryv1.EndpointSlice {
return &discoveryv1.EndpointSlice{
ObjectMeta: metav1.ObjectMeta{
Name: svc.Name,
Name: fmt.Sprintf("%s-%s", svc.Name, string(fam)),
Namespace: svc.Namespace,
Labels: map[string]string{discoveryv1.LabelServiceName: svc.Name},
},
AddressType: fam,
Endpoints: []discoveryv1.Endpoint{{
Addresses: []string{ip},
Conditions: discoveryv1.EndpointConditions{
Ready: ptr.To(true),
Serving: ptr.To(true),
Terminating: ptr.To(false),
},
}},
}
}

View File

@@ -0,0 +1,213 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"context"
"encoding/json"
"fmt"
"net/netip"
"reflect"
"strings"
"go.uber.org/zap"
corev1 "k8s.io/api/core/v1"
discoveryv1 "k8s.io/api/discovery/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
tsoperator "tailscale.com/k8s-operator"
"tailscale.com/kube/egressservices"
"tailscale.com/types/ptr"
)
// egressEpsReconciler reconciles EndpointSlices for tailnet services exposed to cluster via egress ProxyGroup proxies.
type egressEpsReconciler struct {
client.Client
logger *zap.SugaredLogger
tsNamespace string
}
// Reconcile reconciles an EndpointSlice for a tailnet service. It updates the EndpointSlice with the endpoints of
// those ProxyGroup Pods that are ready to route traffic to the tailnet service.
// It compares tailnet service state stored in egress proxy state Secrets by containerboot with the desired
// configuration stored in proxy-cfg ConfigMap to determine if the endpoint is ready.
func (er *egressEpsReconciler) Reconcile(ctx context.Context, req reconcile.Request) (res reconcile.Result, err error) {
l := er.logger.With("Service", req.NamespacedName)
l.Debugf("starting reconcile")
defer l.Debugf("reconcile finished")
eps := new(discoveryv1.EndpointSlice)
err = er.Get(ctx, req.NamespacedName, eps)
if apierrors.IsNotFound(err) {
l.Debugf("EndpointSlice not found")
return reconcile.Result{}, nil
}
if err != nil {
return reconcile.Result{}, fmt.Errorf("failed to get EndpointSlice: %w", err)
}
if !eps.DeletionTimestamp.IsZero() {
l.Debugf("EnpointSlice is being deleted")
return res, nil
}
// Get the user-created ExternalName Service and use its status conditions to determine whether cluster
// resources are set up for this tailnet service.
svc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: eps.Labels[LabelParentName],
Namespace: eps.Labels[LabelParentNamespace],
},
}
err = er.Get(ctx, client.ObjectKeyFromObject(svc), svc)
if apierrors.IsNotFound(err) {
l.Infof("ExternalName Service %s/%s not found, perhaps it was deleted", svc.Namespace, svc.Name)
return res, nil
}
if err != nil {
return res, fmt.Errorf("error retrieving ExternalName Service: %w", err)
}
if !tsoperator.EgressServiceIsValidAndConfigured(svc) {
l.Infof("Cluster resources for ExternalName Service %s/%s are not yet configured", svc.Namespace, svc.Name)
return res, nil
}
// TODO(irbekrm): currently this reconcile loop runs all the checks every time it's triggered, which is
// wasteful. Once we have a Ready condition for ExternalName Services for ProxyGroup, use the condition to
// determine if a reconcile is needed.
oldEps := eps.DeepCopy()
proxyGroupName := eps.Labels[labelProxyGroup]
tailnetSvc := tailnetSvcName(svc)
l = l.With("tailnet-service-name", tailnetSvc)
// Retrieve the desired tailnet service configuration from the ConfigMap.
_, cfgs, err := egressSvcsConfigs(ctx, er.Client, proxyGroupName, er.tsNamespace)
if err != nil {
return res, fmt.Errorf("error retrieving tailnet services configuration: %w", err)
}
cfg, ok := (*cfgs)[tailnetSvc]
if !ok {
l.Infof("[unexpected] configuration for tailnet service %s not found", tailnetSvc)
return res, nil
}
// Check which Pods in ProxyGroup are ready to route traffic to this
// egress service.
podList := &corev1.PodList{}
if err := er.List(ctx, podList, client.MatchingLabels(pgLabels(proxyGroupName, nil))); err != nil {
return res, fmt.Errorf("error listing Pods for ProxyGroup %s: %w", proxyGroupName, err)
}
newEndpoints := make([]discoveryv1.Endpoint, 0)
for _, pod := range podList.Items {
ready, err := er.podIsReadyToRouteTraffic(ctx, pod, &cfg, tailnetSvc, l)
if err != nil {
return res, fmt.Errorf("error verifying if Pod is ready to route traffic: %w", err)
}
if !ready {
continue // maybe next time
}
podIP, err := podIPv4(&pod) // we currently only support IPv4
if err != nil {
return res, fmt.Errorf("error determining IPv4 address for Pod: %w", err)
}
newEndpoints = append(newEndpoints, discoveryv1.Endpoint{
Hostname: (*string)(&pod.UID),
Addresses: []string{podIP},
Conditions: discoveryv1.EndpointConditions{
Ready: ptr.To(true),
Serving: ptr.To(true),
Terminating: ptr.To(false),
},
})
}
// Note that Endpoints are being overwritten with the currently valid endpoints so we don't need to explicitly
// run a cleanup for deleted Pods etc.
eps.Endpoints = newEndpoints
if !reflect.DeepEqual(eps, oldEps) {
l.Infof("Updating EndpointSlice to ensure traffic is routed to ready proxy Pods")
if err := er.Update(ctx, eps); err != nil {
return res, fmt.Errorf("error updating EndpointSlice: %w", err)
}
}
return res, nil
}
func podIPv4(pod *corev1.Pod) (string, error) {
for _, ip := range pod.Status.PodIPs {
parsed, err := netip.ParseAddr(ip.IP)
if err != nil {
return "", fmt.Errorf("error parsing IP address %s: %w", ip, err)
}
if parsed.Is4() {
return parsed.String(), nil
}
}
return "", nil
}
// podIsReadyToRouteTraffic returns true if it appears that the proxy Pod has configured firewall rules to be able to
// route traffic to the given tailnet service. It retrieves the proxy's state Secret and compares the tailnet service
// status written there to the desired service configuration.
func (er *egressEpsReconciler) podIsReadyToRouteTraffic(ctx context.Context, pod corev1.Pod, cfg *egressservices.Config, tailnetSvcName string, l *zap.SugaredLogger) (bool, error) {
l = l.With("proxy_pod", pod.Name)
l.Debugf("checking whether proxy is ready to route to egress service")
if !pod.DeletionTimestamp.IsZero() {
l.Debugf("proxy Pod is being deleted, ignore")
return false, nil
}
podIP, err := podIPv4(&pod)
if err != nil {
return false, fmt.Errorf("error determining Pod IP address: %v", err)
}
if podIP == "" {
l.Infof("[unexpected] Pod does not have an IPv4 address, and IPv6 is not currently supported")
return false, nil
}
stateS := &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Name: pod.Name,
Namespace: pod.Namespace,
},
}
err = er.Get(ctx, client.ObjectKeyFromObject(stateS), stateS)
if apierrors.IsNotFound(err) {
l.Debugf("proxy does not have a state Secret, waiting...")
return false, nil
}
if err != nil {
return false, fmt.Errorf("error getting state Secret: %w", err)
}
svcStatusBS := stateS.Data[egressservices.KeyEgressServices]
if len(svcStatusBS) == 0 {
l.Debugf("proxy's state Secret does not contain egress services status, waiting...")
return false, nil
}
svcStatus := &egressservices.Status{}
if err := json.Unmarshal(svcStatusBS, svcStatus); err != nil {
return false, fmt.Errorf("error unmarshalling egress service status: %w", err)
}
if !strings.EqualFold(podIP, svcStatus.PodIPv4) {
l.Infof("proxy's egress service status is for Pod IP %s, current proxy's Pod IP %s, waiting for the proxy to reconfigure...", svcStatus.PodIPv4, podIP)
return false, nil
}
st, ok := (*svcStatus).Services[tailnetSvcName]
if !ok {
l.Infof("proxy's state Secret does not have egress service status, waiting...")
return false, nil
}
if !reflect.DeepEqual(cfg.TailnetTarget, st.TailnetTarget) {
l.Infof("proxy has configured egress service for tailnet target %v, current target is %v, waiting for proxy to reconfigure...", st.TailnetTarget, cfg.TailnetTarget)
return false, nil
}
if !reflect.DeepEqual(cfg.Ports, st.Ports) {
l.Debugf("proxy has configured egress service for ports %#+v, wants ports %#+v, waiting for proxy to reconfigure", st.Ports, cfg.Ports)
return false, nil
}
l.Debugf("proxy is ready to route traffic to egress service")
return true, nil
}

View File

@@ -0,0 +1,211 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"encoding/json"
"fmt"
"math/rand/v2"
"testing"
"github.com/AlekSi/pointer"
"go.uber.org/zap"
corev1 "k8s.io/api/core/v1"
discoveryv1 "k8s.io/api/discovery/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/egressservices"
"tailscale.com/tstest"
"tailscale.com/util/mak"
)
func TestTailscaleEgressEndpointSlices(t *testing.T) {
clock := tstest.NewClock(tstest.ClockOpts{})
svc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
Namespace: "default",
UID: types.UID("1234-UID"),
Annotations: map[string]string{
AnnotationTailnetTargetFQDN: "foo.bar.ts.net",
AnnotationProxyGroup: "foo",
},
},
Spec: corev1.ServiceSpec{
ExternalName: "placeholder",
Type: corev1.ServiceTypeExternalName,
Selector: nil,
Ports: []corev1.ServicePort{
{
Name: "http",
Protocol: "TCP",
Port: 80,
},
},
},
Status: corev1.ServiceStatus{
Conditions: []metav1.Condition{
condition(tsapi.EgressSvcConfigured, metav1.ConditionTrue, "", "", clock),
condition(tsapi.EgressSvcValid, metav1.ConditionTrue, "", "", clock),
},
},
}
port := randomPort()
cm := configMapForSvc(t, svc, port)
fc := fake.NewClientBuilder().
WithScheme(tsapi.GlobalScheme).
WithObjects(svc, cm).
WithStatusSubresource(svc).
Build()
zl, err := zap.NewDevelopment()
if err != nil {
t.Fatal(err)
}
er := &egressEpsReconciler{
Client: fc,
logger: zl.Sugar(),
tsNamespace: "operator-ns",
}
eps := &discoveryv1.EndpointSlice{
ObjectMeta: metav1.ObjectMeta{
Name: "foo",
Namespace: "operator-ns",
Labels: map[string]string{
LabelParentName: "test",
LabelParentNamespace: "default",
labelSvcType: typeEgress,
labelProxyGroup: "foo"},
},
AddressType: discoveryv1.AddressTypeIPv4,
}
mustCreate(t, fc, eps)
t.Run("no_proxy_group_resources", func(t *testing.T) {
expectReconciled(t, er, "operator-ns", "foo") // should not error
})
t.Run("no_pods_ready_to_route_traffic", func(t *testing.T) {
pod, stateS := podAndSecretForProxyGroup("foo")
mustCreate(t, fc, pod)
mustCreate(t, fc, stateS)
expectReconciled(t, er, "operator-ns", "foo") // should not error
})
t.Run("pods_are_ready_to_route_traffic", func(t *testing.T) {
pod, stateS := podAndSecretForProxyGroup("foo")
stBs := serviceStatusForPodIP(t, svc, pod.Status.PodIPs[0].IP, port)
mustUpdate(t, fc, "operator-ns", stateS.Name, func(s *corev1.Secret) {
mak.Set(&s.Data, egressservices.KeyEgressServices, stBs)
})
expectReconciled(t, er, "operator-ns", "foo")
eps.Endpoints = append(eps.Endpoints, discoveryv1.Endpoint{
Addresses: []string{"10.0.0.1"},
Hostname: pointer.To("foo"),
Conditions: discoveryv1.EndpointConditions{
Serving: pointer.ToBool(true),
Ready: pointer.ToBool(true),
Terminating: pointer.ToBool(false),
},
})
expectEqual(t, fc, eps, nil)
})
t.Run("status_does_not_match_pod_ip", func(t *testing.T) {
_, stateS := podAndSecretForProxyGroup("foo") // replica Pod has IP 10.0.0.1
stBs := serviceStatusForPodIP(t, svc, "10.0.0.2", port) // status is for a Pod with IP 10.0.0.2
mustUpdate(t, fc, "operator-ns", stateS.Name, func(s *corev1.Secret) {
mak.Set(&s.Data, egressservices.KeyEgressServices, stBs)
})
expectReconciled(t, er, "operator-ns", "foo")
eps.Endpoints = []discoveryv1.Endpoint{}
expectEqual(t, fc, eps, nil)
})
}
func configMapForSvc(t *testing.T, svc *corev1.Service, p uint16) *corev1.ConfigMap {
t.Helper()
ports := make(map[egressservices.PortMap]struct{})
for _, port := range svc.Spec.Ports {
ports[egressservices.PortMap{Protocol: string(port.Protocol), MatchPort: p, TargetPort: uint16(port.Port)}] = struct{}{}
}
cfg := egressservices.Config{
Ports: ports,
}
if fqdn := svc.Annotations[AnnotationTailnetTargetFQDN]; fqdn != "" {
cfg.TailnetTarget = egressservices.TailnetTarget{FQDN: fqdn}
}
if ip := svc.Annotations[AnnotationTailnetTargetIP]; ip != "" {
cfg.TailnetTarget = egressservices.TailnetTarget{IP: ip}
}
name := tailnetSvcName(svc)
cfgs := egressservices.Configs{name: cfg}
bs, err := json.Marshal(&cfgs)
if err != nil {
t.Fatalf("error marshalling config: %v", err)
}
cm := &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: pgEgressCMName(svc.Annotations[AnnotationProxyGroup]),
Namespace: "operator-ns",
},
BinaryData: map[string][]byte{egressservices.KeyEgressServices: bs},
}
return cm
}
func serviceStatusForPodIP(t *testing.T, svc *corev1.Service, ip string, p uint16) []byte {
t.Helper()
ports := make(map[egressservices.PortMap]struct{})
for _, port := range svc.Spec.Ports {
ports[egressservices.PortMap{Protocol: string(port.Protocol), MatchPort: p, TargetPort: uint16(port.Port)}] = struct{}{}
}
svcSt := egressservices.ServiceStatus{Ports: ports}
if fqdn := svc.Annotations[AnnotationTailnetTargetFQDN]; fqdn != "" {
svcSt.TailnetTarget = egressservices.TailnetTarget{FQDN: fqdn}
}
if ip := svc.Annotations[AnnotationTailnetTargetIP]; ip != "" {
svcSt.TailnetTarget = egressservices.TailnetTarget{IP: ip}
}
svcName := tailnetSvcName(svc)
st := egressservices.Status{
PodIPv4: ip,
Services: map[string]*egressservices.ServiceStatus{svcName: &svcSt},
}
bs, err := json.Marshal(st)
if err != nil {
t.Fatalf("error marshalling service status: %v", err)
}
return bs
}
func podAndSecretForProxyGroup(pg string) (*corev1.Pod, *corev1.Secret) {
p := &corev1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-0", pg),
Namespace: "operator-ns",
Labels: pgLabels(pg, nil),
UID: "foo",
},
Status: corev1.PodStatus{
PodIPs: []corev1.PodIP{
{IP: "10.0.0.1"},
},
},
}
s := &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-0", pg),
Namespace: "operator-ns",
Labels: pgSecretLabels(pg, "state"),
},
}
return p, s
}
func randomPort() uint16 {
return uint16(rand.Int32N(1000) + 1000)
}

View File

@@ -0,0 +1,179 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"context"
"errors"
"fmt"
"strings"
"go.uber.org/zap"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
discoveryv1 "k8s.io/api/discovery/v1"
apiequality "k8s.io/apimachinery/pkg/api/equality"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/tstime"
)
const (
reasonReadinessCheckFailed = "ReadinessCheckFailed"
reasonClusterResourcesNotReady = "ClusterResourcesNotReady"
reasonNoProxies = "NoProxiesConfigured"
reasonNotReady = "NotReadyToRouteTraffic"
reasonReady = "ReadyToRouteTraffic"
reasonPartiallyReady = "PartiallyReadyToRouteTraffic"
msgReadyToRouteTemplate = "%d out of %d replicas are ready to route traffic"
)
type egressSvcsReadinessReconciler struct {
client.Client
logger *zap.SugaredLogger
clock tstime.Clock
tsNamespace string
}
// Reconcile reconciles an ExternalName Service that defines a tailnet target to be exposed on a ProxyGroup and sets the
// EgressSvcReady condition on it. The condition gets set to true if at least one of the proxies is currently ready to
// route traffic to the target. It compares proxy Pod IPs with the endpoints set on the EndpointSlice for the egress
// service to determine how many replicas are currently able to route traffic.
func (esrr *egressSvcsReadinessReconciler) Reconcile(ctx context.Context, req reconcile.Request) (res reconcile.Result, err error) {
l := esrr.logger.With("Service", req.NamespacedName)
defer l.Info("reconcile finished")
svc := new(corev1.Service)
if err = esrr.Get(ctx, req.NamespacedName, svc); apierrors.IsNotFound(err) {
l.Info("Service not found")
return res, nil
} else if err != nil {
return res, fmt.Errorf("failed to get Service: %w", err)
}
var (
reason, msg string
st metav1.ConditionStatus = metav1.ConditionUnknown
)
oldStatus := svc.Status.DeepCopy()
defer func() {
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcReady, st, reason, msg, esrr.clock, l)
if !apiequality.Semantic.DeepEqual(oldStatus, svc.Status) {
err = errors.Join(err, esrr.Status().Update(ctx, svc))
}
}()
crl := egressSvcChildResourceLabels(svc)
eps, err := getSingleObject[discoveryv1.EndpointSlice](ctx, esrr.Client, esrr.tsNamespace, crl)
if err != nil {
err = fmt.Errorf("error getting EndpointSlice: %w", err)
reason = reasonReadinessCheckFailed
msg = err.Error()
return res, err
}
if eps == nil {
l.Infof("EndpointSlice for Service does not yet exist, waiting...")
reason, msg = reasonClusterResourcesNotReady, reasonClusterResourcesNotReady
st = metav1.ConditionFalse
return res, nil
}
pg := &tsapi.ProxyGroup{
ObjectMeta: metav1.ObjectMeta{
Name: svc.Annotations[AnnotationProxyGroup],
},
}
err = esrr.Get(ctx, client.ObjectKeyFromObject(pg), pg)
if apierrors.IsNotFound(err) {
l.Infof("ProxyGroup for Service does not exist, waiting...")
reason, msg = reasonClusterResourcesNotReady, reasonClusterResourcesNotReady
st = metav1.ConditionFalse
return res, nil
}
if err != nil {
err = fmt.Errorf("error retrieving ProxyGroup: %w", err)
reason = reasonReadinessCheckFailed
msg = err.Error()
return res, err
}
if !tsoperator.ProxyGroupIsReady(pg) {
l.Infof("ProxyGroup for Service is not ready, waiting...")
reason, msg = reasonClusterResourcesNotReady, reasonClusterResourcesNotReady
st = metav1.ConditionFalse
return res, nil
}
replicas := pgReplicas(pg)
if replicas == 0 {
l.Infof("ProxyGroup replicas set to 0")
reason, msg = reasonNoProxies, reasonNoProxies
st = metav1.ConditionFalse
return res, nil
}
podLabels := pgLabels(pg.Name, nil)
var readyReplicas int32
for i := range replicas {
podLabels[appsv1.PodIndexLabel] = fmt.Sprintf("%d", i)
pod, err := getSingleObject[corev1.Pod](ctx, esrr.Client, esrr.tsNamespace, podLabels)
if err != nil {
err = fmt.Errorf("error retrieving ProxyGroup Pod: %w", err)
reason = reasonReadinessCheckFailed
msg = err.Error()
return res, err
}
if pod == nil {
l.Infof("[unexpected] ProxyGroup is ready, but replica %d was not found", i)
reason, msg = reasonClusterResourcesNotReady, reasonClusterResourcesNotReady
return res, nil
}
l.Infof("looking at Pod with IPs %v", pod.Status.PodIPs)
ready := false
for _, ep := range eps.Endpoints {
l.Infof("looking at endpoint with addresses %v", ep.Addresses)
if endpointReadyForPod(&ep, pod, l) {
l.Infof("endpoint is ready for Pod")
ready = true
break
}
}
if ready {
readyReplicas++
}
}
msg = fmt.Sprintf(msgReadyToRouteTemplate, readyReplicas, replicas)
if readyReplicas == 0 {
reason = reasonNotReady
st = metav1.ConditionFalse
return res, nil
}
st = metav1.ConditionTrue
if readyReplicas < replicas {
reason = reasonPartiallyReady
} else {
reason = reasonReady
}
return res, nil
}
// endpointReadyForPod returns true if the endpoint is for the Pod's IPv4 address and is ready to serve traffic.
// Endpoint must not be nil.
func endpointReadyForPod(ep *discoveryv1.Endpoint, pod *corev1.Pod, l *zap.SugaredLogger) bool {
podIP, err := podIPv4(pod)
if err != nil {
l.Infof("[unexpected] error retrieving Pod's IPv4 address: %v", err)
return false
}
// Currently we only ever set a single address on and Endpoint and nothing else is meant to modify this.
if len(ep.Addresses) != 1 {
return false
}
return strings.EqualFold(ep.Addresses[0], podIP) &&
*ep.Conditions.Ready &&
*ep.Conditions.Serving &&
!*ep.Conditions.Terminating
}

View File

@@ -0,0 +1,169 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"fmt"
"testing"
"github.com/AlekSi/pointer"
"go.uber.org/zap"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
discoveryv1 "k8s.io/api/discovery/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/tstest"
"tailscale.com/tstime"
)
func TestEgressServiceReadiness(t *testing.T) {
// We need to pass a ProxyGroup object to WithStatusSubresource because of some quirks in how the fake client
// works. Without this code further down would not be able to update ProxyGroup status.
fc := fake.NewClientBuilder().
WithScheme(tsapi.GlobalScheme).
WithStatusSubresource(&tsapi.ProxyGroup{}).
Build()
zl, _ := zap.NewDevelopment()
cl := tstest.NewClock(tstest.ClockOpts{})
rec := &egressSvcsReadinessReconciler{
tsNamespace: "operator-ns",
Client: fc,
logger: zl.Sugar(),
clock: cl,
}
tailnetFQDN := "my-app.tailnetxyz.ts.net"
egressSvc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: "my-app",
Namespace: "dev",
Annotations: map[string]string{
AnnotationProxyGroup: "dev",
AnnotationTailnetTargetFQDN: tailnetFQDN,
},
},
}
fakeClusterIPSvc := &corev1.Service{ObjectMeta: metav1.ObjectMeta{Name: "my-app", Namespace: "operator-ns"}}
l := egressSvcEpsLabels(egressSvc, fakeClusterIPSvc)
eps := &discoveryv1.EndpointSlice{
ObjectMeta: metav1.ObjectMeta{
Name: "my-app",
Namespace: "operator-ns",
Labels: l,
},
AddressType: discoveryv1.AddressTypeIPv4,
}
pg := &tsapi.ProxyGroup{
ObjectMeta: metav1.ObjectMeta{
Name: "dev",
},
}
mustCreate(t, fc, egressSvc)
setClusterNotReady(egressSvc, cl, zl.Sugar())
t.Run("endpointslice_does_not_exist", func(t *testing.T) {
expectReconciled(t, rec, "dev", "my-app")
expectEqual(t, fc, egressSvc, nil) // not ready
})
t.Run("proxy_group_does_not_exist", func(t *testing.T) {
mustCreate(t, fc, eps)
expectReconciled(t, rec, "dev", "my-app")
expectEqual(t, fc, egressSvc, nil) // still not ready
})
t.Run("proxy_group_not_ready", func(t *testing.T) {
mustCreate(t, fc, pg)
expectReconciled(t, rec, "dev", "my-app")
expectEqual(t, fc, egressSvc, nil) // still not ready
})
t.Run("no_ready_replicas", func(t *testing.T) {
setPGReady(pg, cl, zl.Sugar())
mustUpdateStatus(t, fc, pg.Namespace, pg.Name, func(p *tsapi.ProxyGroup) {
p.Status = pg.Status
})
expectEqual(t, fc, pg, nil)
for i := range pgReplicas(pg) {
p := pod(pg, i)
mustCreate(t, fc, p)
mustUpdateStatus(t, fc, p.Namespace, p.Name, func(existing *corev1.Pod) {
existing.Status.PodIPs = p.Status.PodIPs
})
}
expectReconciled(t, rec, "dev", "my-app")
setNotReady(egressSvc, cl, zl.Sugar(), pgReplicas(pg))
expectEqual(t, fc, egressSvc, nil) // still not ready
})
t.Run("one_ready_replica", func(t *testing.T) {
setEndpointForReplica(pg, 0, eps)
mustUpdate(t, fc, eps.Namespace, eps.Name, func(e *discoveryv1.EndpointSlice) {
e.Endpoints = eps.Endpoints
})
setReady(egressSvc, cl, zl.Sugar(), pgReplicas(pg), 1)
expectReconciled(t, rec, "dev", "my-app")
expectEqual(t, fc, egressSvc, nil) // partially ready
})
t.Run("all_replicas_ready", func(t *testing.T) {
for i := range pgReplicas(pg) {
setEndpointForReplica(pg, i, eps)
}
mustUpdate(t, fc, eps.Namespace, eps.Name, func(e *discoveryv1.EndpointSlice) {
e.Endpoints = eps.Endpoints
})
setReady(egressSvc, cl, zl.Sugar(), pgReplicas(pg), pgReplicas(pg))
expectReconciled(t, rec, "dev", "my-app")
expectEqual(t, fc, egressSvc, nil) // ready
})
}
func setClusterNotReady(svc *corev1.Service, cl tstime.Clock, l *zap.SugaredLogger) {
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcReady, metav1.ConditionFalse, reasonClusterResourcesNotReady, reasonClusterResourcesNotReady, cl, l)
}
func setNotReady(svc *corev1.Service, cl tstime.Clock, l *zap.SugaredLogger, replicas int32) {
msg := fmt.Sprintf(msgReadyToRouteTemplate, 0, replicas)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcReady, metav1.ConditionFalse, reasonNotReady, msg, cl, l)
}
func setReady(svc *corev1.Service, cl tstime.Clock, l *zap.SugaredLogger, replicas, readyReplicas int32) {
reason := reasonPartiallyReady
if readyReplicas == replicas {
reason = reasonReady
}
msg := fmt.Sprintf(msgReadyToRouteTemplate, readyReplicas, replicas)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcReady, metav1.ConditionTrue, reason, msg, cl, l)
}
func setPGReady(pg *tsapi.ProxyGroup, cl tstime.Clock, l *zap.SugaredLogger) {
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionTrue, "foo", "foo", pg.Generation, cl, l)
}
func setEndpointForReplica(pg *tsapi.ProxyGroup, ordinal int32, eps *discoveryv1.EndpointSlice) {
p := pod(pg, ordinal)
eps.Endpoints = append(eps.Endpoints, discoveryv1.Endpoint{
Addresses: []string{p.Status.PodIPs[0].IP},
Conditions: discoveryv1.EndpointConditions{
Ready: pointer.ToBool(true),
Serving: pointer.ToBool(true),
Terminating: pointer.ToBool(false),
},
})
}
func pod(pg *tsapi.ProxyGroup, ordinal int32) *corev1.Pod {
l := pgLabels(pg.Name, nil)
l[appsv1.PodIndexLabel] = fmt.Sprintf("%d", ordinal)
ip := fmt.Sprintf("10.0.0.%d", ordinal)
return &corev1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-%d", pg.Name, ordinal),
Namespace: "operator-ns",
Labels: l,
},
Status: corev1.PodStatus{
PodIPs: []corev1.PodIP{{IP: ip}},
},
}
}

View File

@@ -0,0 +1,716 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"context"
"crypto/sha256"
"encoding/json"
"errors"
"fmt"
"math/rand/v2"
"reflect"
"slices"
"strings"
"sync"
"go.uber.org/zap"
corev1 "k8s.io/api/core/v1"
discoveryv1 "k8s.io/api/discovery/v1"
apiequality "k8s.io/apimachinery/pkg/api/equality"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/intstr"
"k8s.io/apimachinery/pkg/util/sets"
"k8s.io/apiserver/pkg/storage/names"
"k8s.io/client-go/tools/record"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/egressservices"
"tailscale.com/kube/kubetypes"
"tailscale.com/tstime"
"tailscale.com/util/clientmetric"
"tailscale.com/util/mak"
"tailscale.com/util/set"
)
const (
reasonEgressSvcInvalid = "EgressSvcInvalid"
reasonEgressSvcValid = "EgressSvcValid"
reasonEgressSvcCreationFailed = "EgressSvcCreationFailed"
reasonProxyGroupNotReady = "ProxyGroupNotReady"
labelProxyGroup = "tailscale.com/proxy-group"
labelSvcType = "tailscale.com/svc-type" // ingress or egress
typeEgress = "egress"
// maxPorts is the maximum number of ports that can be exposed on a
// container. In practice this will be ports in range [3000 - 4000). The
// high range should make it easier to distinguish container ports from
// the tailnet target ports for debugging purposes (i.e when reading
// netfilter rules). The limit of 10000 is somewhat arbitrary, the
// assumption is that this would not be hit in practice.
maxPorts = 10000
indexEgressProxyGroup = ".metadata.annotations.egress-proxy-group"
)
var gaugeEgressServices = clientmetric.NewGauge(kubetypes.MetricEgressServiceCount)
// egressSvcsReconciler reconciles user created ExternalName Services that specify a tailnet
// endpoint that should be exposed to cluster workloads and an egress ProxyGroup
// on whose proxies it should be exposed.
type egressSvcsReconciler struct {
client.Client
logger *zap.SugaredLogger
recorder record.EventRecorder
clock tstime.Clock
tsNamespace string
mu sync.Mutex // protects following
svcs set.Slice[types.UID] // UIDs of all currently managed egress Services for ProxyGroup
}
// Reconcile reconciles an ExternalName Service that specifies a tailnet target and a ProxyGroup on whose proxies should
// forward cluster traffic to the target.
// For an ExternalName Service the reconciler:
//
// - for each port N defined on the ExternalName Service, allocates a port X in range [3000- 4000), unique for the
// ProxyGroup proxies. Proxies will forward cluster traffic received on port N to port M on the tailnet target
//
// - creates a ClusterIP Service in the operator's namespace with portmappings for all M->N port pairs. This will allow
// cluster workloads to send traffic on the user-defined tailnet target port and get it transparently mapped to the
// randomly selected port on proxy Pods.
//
// - creates an EndpointSlice in the operator's namespace with kubernetes.io/service-name label pointing to the
// ClusterIP Service. The endpoints will get dynamically updates to proxy Pod IPs as the Pods become ready to route
// traffic to the tailnet target. kubernetes.io/service-name label ensures that kube-proxy sets up routing rules to
// forward cluster traffic received on ClusterIP Service's IP address to the endpoints (Pod IPs).
//
// - updates the egress service config in a ConfigMap mounted to the ProxyGroup proxies with the tailnet target and the
// portmappings.
func (esr *egressSvcsReconciler) Reconcile(ctx context.Context, req reconcile.Request) (res reconcile.Result, err error) {
l := esr.logger.With("Service", req.NamespacedName)
defer l.Info("reconcile finished")
svc := new(corev1.Service)
if err = esr.Get(ctx, req.NamespacedName, svc); apierrors.IsNotFound(err) {
l.Info("Service not found")
return res, nil
} else if err != nil {
return res, fmt.Errorf("failed to get Service: %w", err)
}
// Name of the 'egress service', meaning the tailnet target.
tailnetSvc := tailnetSvcName(svc)
l = l.With("tailnet-service", tailnetSvc)
// Note that resources for egress Services are only cleaned up when the
// Service is actually deleted (and not if, for example, user decides to
// remove the Tailscale annotation from it). This should be fine- we
// assume that the egress ExternalName Services are always created for
// Tailscale operator specifically.
if !svc.DeletionTimestamp.IsZero() {
l.Info("Service is being deleted, ensuring resource cleanup")
return res, esr.maybeCleanup(ctx, svc, l)
}
oldStatus := svc.Status.DeepCopy()
defer func() {
if !apiequality.Semantic.DeepEqual(oldStatus, svc.Status) {
err = errors.Join(err, esr.Status().Update(ctx, svc))
}
}()
// Validate the user-created ExternalName Service and the associated ProxyGroup.
if ok, err := esr.validateClusterResources(ctx, svc, l); err != nil {
return res, fmt.Errorf("error validating cluster resources: %w", err)
} else if !ok {
return res, nil
}
if !slices.Contains(svc.Finalizers, FinalizerName) {
l.Infof("configuring tailnet service") // logged exactly once
svc.Finalizers = append(svc.Finalizers, FinalizerName)
if err := esr.Update(ctx, svc); err != nil {
err := fmt.Errorf("failed to add finalizer: %w", err)
r := svcConfiguredReason(svc, false, l)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcConfigured, metav1.ConditionFalse, r, err.Error(), esr.clock, l)
return res, err
}
esr.mu.Lock()
esr.svcs.Add(svc.UID)
gaugeEgressServices.Set(int64(esr.svcs.Len()))
esr.mu.Unlock()
}
if err := esr.maybeCleanupProxyGroupConfig(ctx, svc, l); err != nil {
err = fmt.Errorf("cleaning up resources for previous ProxyGroup failed: %w", err)
r := svcConfiguredReason(svc, false, l)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcConfigured, metav1.ConditionFalse, r, err.Error(), esr.clock, l)
return res, err
}
return res, esr.maybeProvision(ctx, svc, l)
}
func (esr *egressSvcsReconciler) maybeProvision(ctx context.Context, svc *corev1.Service, l *zap.SugaredLogger) (err error) {
r := svcConfiguredReason(svc, false, l)
st := metav1.ConditionFalse
defer func() {
msg := r
if st != metav1.ConditionTrue && err != nil {
msg = err.Error()
}
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcConfigured, st, r, msg, esr.clock, l)
}()
crl := egressSvcChildResourceLabels(svc)
clusterIPSvc, err := getSingleObject[corev1.Service](ctx, esr.Client, esr.tsNamespace, crl)
if err != nil {
err = fmt.Errorf("error retrieving ClusterIP Service: %w", err)
return err
}
if clusterIPSvc == nil {
clusterIPSvc = esr.clusterIPSvcForEgress(crl)
}
upToDate := svcConfigurationUpToDate(svc, l)
provisioned := true
if !upToDate {
if clusterIPSvc, provisioned, err = esr.provision(ctx, svc.Annotations[AnnotationProxyGroup], svc, clusterIPSvc, l); err != nil {
return err
}
}
if !provisioned {
l.Infof("unable to provision cluster resources")
return nil
}
// Update ExternalName Service to point at the ClusterIP Service.
clusterDomain := retrieveClusterDomain(esr.tsNamespace, l)
clusterIPSvcFQDN := fmt.Sprintf("%s.%s.svc.%s", clusterIPSvc.Name, clusterIPSvc.Namespace, clusterDomain)
if svc.Spec.ExternalName != clusterIPSvcFQDN {
l.Infof("Configuring ExternalName Service to point to ClusterIP Service %s", clusterIPSvcFQDN)
svc.Spec.ExternalName = clusterIPSvcFQDN
if err = esr.Update(ctx, svc); err != nil {
err = fmt.Errorf("error updating ExternalName Service: %w", err)
return err
}
}
r = svcConfiguredReason(svc, true, l)
st = metav1.ConditionTrue
return nil
}
func (esr *egressSvcsReconciler) provision(ctx context.Context, proxyGroupName string, svc, clusterIPSvc *corev1.Service, l *zap.SugaredLogger) (*corev1.Service, bool, error) {
l.Infof("updating configuration...")
usedPorts, err := esr.usedPortsForPG(ctx, proxyGroupName)
if err != nil {
return nil, false, fmt.Errorf("error calculating used ports for ProxyGroup %s: %w", proxyGroupName, err)
}
oldClusterIPSvc := clusterIPSvc.DeepCopy()
// loop over ClusterIP Service ports, remove any that are not needed.
for i := len(clusterIPSvc.Spec.Ports) - 1; i >= 0; i-- {
pm := clusterIPSvc.Spec.Ports[i]
found := false
for _, wantsPM := range svc.Spec.Ports {
if wantsPM.Port == pm.Port && strings.EqualFold(string(wantsPM.Protocol), string(pm.Protocol)) {
found = true
break
}
}
if !found {
l.Debugf("portmapping %s:%d -> %s:%d is no longer required, removing", pm.Protocol, pm.TargetPort.IntVal, pm.Protocol, pm.Port)
clusterIPSvc.Spec.Ports = slices.Delete(clusterIPSvc.Spec.Ports, i, i+1)
}
}
// loop over ExternalName Service ports, for each one not found on
// ClusterIP Service produce new target port and add a portmapping to
// the ClusterIP Service.
for _, wantsPM := range svc.Spec.Ports {
found := false
for _, gotPM := range clusterIPSvc.Spec.Ports {
if wantsPM.Port == gotPM.Port && strings.EqualFold(string(wantsPM.Protocol), string(gotPM.Protocol)) {
found = true
break
}
}
if !found {
// Calculate a free port to expose on container and add
// a new PortMap to the ClusterIP Service.
if usedPorts.Len() == maxPorts {
// TODO(irbekrm): refactor to avoid extra reconciles here. Low priority as in practice,
// the limit should not be hit.
return nil, false, fmt.Errorf("unable to allocate additional ports on ProxyGroup %s, %d ports already used. Create another ProxyGroup or open an issue if you believe this is unexpected.", proxyGroupName, maxPorts)
}
p := unusedPort(usedPorts)
l.Debugf("mapping tailnet target port %d to container port %d", wantsPM.Port, p)
usedPorts.Insert(p)
clusterIPSvc.Spec.Ports = append(clusterIPSvc.Spec.Ports, corev1.ServicePort{
Name: wantsPM.Name,
Protocol: wantsPM.Protocol,
Port: wantsPM.Port,
TargetPort: intstr.FromInt32(p),
})
}
}
if !reflect.DeepEqual(clusterIPSvc, oldClusterIPSvc) {
if clusterIPSvc, err = createOrUpdate(ctx, esr.Client, esr.tsNamespace, clusterIPSvc, func(svc *corev1.Service) {
svc.Labels = clusterIPSvc.Labels
svc.Spec = clusterIPSvc.Spec
}); err != nil {
return nil, false, fmt.Errorf("error ensuring ClusterIP Service: %v", err)
}
}
crl := egressSvcEpsLabels(svc, clusterIPSvc)
// TODO(irbekrm): support IPv6, but need to investigate how kube proxy
// sets up Service -> Pod routing when IPv6 is involved.
eps := &discoveryv1.EndpointSlice{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-ipv4", clusterIPSvc.Name),
Namespace: esr.tsNamespace,
Labels: crl,
},
AddressType: discoveryv1.AddressTypeIPv4,
Ports: epsPortsFromSvc(clusterIPSvc),
}
if eps, err = createOrUpdate(ctx, esr.Client, esr.tsNamespace, eps, func(e *discoveryv1.EndpointSlice) {
e.Labels = eps.Labels
e.AddressType = eps.AddressType
e.Ports = eps.Ports
for _, p := range e.Endpoints {
p.Conditions.Ready = nil
}
}); err != nil {
return nil, false, fmt.Errorf("error ensuring EndpointSlice: %w", err)
}
cm, cfgs, err := egressSvcsConfigs(ctx, esr.Client, proxyGroupName, esr.tsNamespace)
if err != nil {
return nil, false, fmt.Errorf("error retrieving egress services configuration: %w", err)
}
if cm == nil {
l.Info("ConfigMap not yet created, waiting..")
return nil, false, nil
}
tailnetSvc := tailnetSvcName(svc)
gotCfg := (*cfgs)[tailnetSvc]
wantsCfg := egressSvcCfg(svc, clusterIPSvc)
if !reflect.DeepEqual(gotCfg, wantsCfg) {
l.Debugf("updating egress services ConfigMap %s", cm.Name)
mak.Set(cfgs, tailnetSvc, wantsCfg)
bs, err := json.Marshal(cfgs)
if err != nil {
return nil, false, fmt.Errorf("error marshalling egress services configs: %w", err)
}
mak.Set(&cm.BinaryData, egressservices.KeyEgressServices, bs)
if err := esr.Update(ctx, cm); err != nil {
return nil, false, fmt.Errorf("error updating egress services ConfigMap: %w", err)
}
}
l.Infof("egress service configuration has been updated")
return clusterIPSvc, true, nil
}
func (esr *egressSvcsReconciler) maybeCleanup(ctx context.Context, svc *corev1.Service, logger *zap.SugaredLogger) error {
logger.Info("ensuring that resources created for egress service are deleted")
// Delete egress service config from the ConfigMap mounted by the proxies.
if err := esr.ensureEgressSvcCfgDeleted(ctx, svc, logger); err != nil {
return fmt.Errorf("error deleting egress service config: %w", err)
}
// Delete the ClusterIP Service and EndpointSlice for the egress
// service.
types := []client.Object{
&corev1.Service{},
&discoveryv1.EndpointSlice{},
}
crl := egressSvcChildResourceLabels(svc)
for _, typ := range types {
if err := esr.DeleteAllOf(ctx, typ, client.InNamespace(esr.tsNamespace), client.MatchingLabels(crl)); err != nil {
return fmt.Errorf("error deleting %s: %w", typ, err)
}
}
ix := slices.Index(svc.Finalizers, FinalizerName)
if ix != -1 {
logger.Debug("Removing Tailscale finalizer from Service")
svc.Finalizers = append(svc.Finalizers[:ix], svc.Finalizers[ix+1:]...)
if err := esr.Update(ctx, svc); err != nil {
return fmt.Errorf("failed to remove finalizer: %w", err)
}
}
esr.mu.Lock()
esr.svcs.Remove(svc.UID)
gaugeEgressServices.Set(int64(esr.svcs.Len()))
esr.mu.Unlock()
logger.Info("successfully cleaned up resources for egress Service")
return nil
}
func (esr *egressSvcsReconciler) maybeCleanupProxyGroupConfig(ctx context.Context, svc *corev1.Service, l *zap.SugaredLogger) error {
wantsProxyGroup := svc.Annotations[AnnotationProxyGroup]
cond := tsoperator.GetServiceCondition(svc, tsapi.EgressSvcConfigured)
if cond == nil {
return nil
}
ss := strings.Split(cond.Reason, ":")
if len(ss) < 3 {
return nil
}
if strings.EqualFold(wantsProxyGroup, ss[2]) {
return nil
}
esr.logger.Infof("egress Service configured on ProxyGroup %s, wants ProxyGroup %s, cleaning up...", ss[2], wantsProxyGroup)
if err := esr.ensureEgressSvcCfgDeleted(ctx, svc, l); err != nil {
return fmt.Errorf("error deleting egress service config: %w", err)
}
return nil
}
// usedPortsForPG calculates the currently used match ports for ProxyGroup
// containers. It does that by looking by retrieving all target ports of all
// ClusterIP Services created for egress services exposed on this ProxyGroup's
// proxies.
// TODO(irbekrm): this is currently good enough because we only have a single worker and
// because these Services are created by us, so we can always expect to get the
// latest ClusterIP Services via the controller cache. It will not work as well
// once we split into multiple workers- at that point we probably want to set
// used ports on ProxyGroup's status.
func (esr *egressSvcsReconciler) usedPortsForPG(ctx context.Context, pg string) (sets.Set[int32], error) {
svcList := &corev1.ServiceList{}
if err := esr.List(ctx, svcList, client.InNamespace(esr.tsNamespace), client.MatchingLabels(map[string]string{labelProxyGroup: pg})); err != nil {
return nil, fmt.Errorf("error listing Services: %w", err)
}
usedPorts := sets.New[int32]()
for _, s := range svcList.Items {
for _, p := range s.Spec.Ports {
usedPorts.Insert(p.TargetPort.IntVal)
}
}
return usedPorts, nil
}
// clusterIPSvcForEgress returns a template for the ClusterIP Service created
// for an egress service exposed on ProxyGroup proxies. The ClusterIP Service
// has no selector. Traffic sent to it will be routed to the endpoints defined
// by an EndpointSlice created for this egress service.
func (esr *egressSvcsReconciler) clusterIPSvcForEgress(crl map[string]string) *corev1.Service {
return &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
GenerateName: svcNameBase(crl[LabelParentName]),
Namespace: esr.tsNamespace,
Labels: crl,
},
Spec: corev1.ServiceSpec{
Type: corev1.ServiceTypeClusterIP,
},
}
}
func (esr *egressSvcsReconciler) ensureEgressSvcCfgDeleted(ctx context.Context, svc *corev1.Service, logger *zap.SugaredLogger) error {
crl := egressSvcChildResourceLabels(svc)
cmName := pgEgressCMName(crl[labelProxyGroup])
cm := &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: cmName,
Namespace: esr.tsNamespace,
},
}
l := logger.With("ConfigMap", client.ObjectKeyFromObject(cm))
l.Debug("ensuring that egress service configuration is removed from proxy config")
if err := esr.Get(ctx, client.ObjectKeyFromObject(cm), cm); apierrors.IsNotFound(err) {
l.Debugf("ConfigMap not found")
return nil
} else if err != nil {
return fmt.Errorf("error retrieving ConfigMap: %w", err)
}
bs := cm.BinaryData[egressservices.KeyEgressServices]
if len(bs) == 0 {
l.Debugf("ConfigMap does not contain egress service configs")
return nil
}
cfgs := &egressservices.Configs{}
if err := json.Unmarshal(bs, cfgs); err != nil {
return fmt.Errorf("error unmarshalling egress services configs")
}
tailnetSvc := tailnetSvcName(svc)
_, ok := (*cfgs)[tailnetSvc]
if !ok {
l.Debugf("ConfigMap does not contain egress service config, likely because it was already deleted")
return nil
}
l.Infof("before deleting config %+#v", *cfgs)
delete(*cfgs, tailnetSvc)
l.Infof("after deleting config %+#v", *cfgs)
bs, err := json.Marshal(cfgs)
if err != nil {
return fmt.Errorf("error marshalling egress services configs: %w", err)
}
mak.Set(&cm.BinaryData, egressservices.KeyEgressServices, bs)
return esr.Update(ctx, cm)
}
func (esr *egressSvcsReconciler) validateClusterResources(ctx context.Context, svc *corev1.Service, l *zap.SugaredLogger) (bool, error) {
proxyGroupName := svc.Annotations[AnnotationProxyGroup]
pg := &tsapi.ProxyGroup{
ObjectMeta: metav1.ObjectMeta{
Name: proxyGroupName,
},
}
if err := esr.Get(ctx, client.ObjectKeyFromObject(pg), pg); apierrors.IsNotFound(err) {
l.Infof("ProxyGroup %q not found, waiting...", proxyGroupName)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcValid, metav1.ConditionUnknown, reasonProxyGroupNotReady, reasonProxyGroupNotReady, esr.clock, l)
tsoperator.RemoveServiceCondition(svc, tsapi.EgressSvcConfigured)
return false, nil
} else if err != nil {
err := fmt.Errorf("unable to retrieve ProxyGroup %s: %w", proxyGroupName, err)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcValid, metav1.ConditionUnknown, reasonProxyGroupNotReady, err.Error(), esr.clock, l)
tsoperator.RemoveServiceCondition(svc, tsapi.EgressSvcConfigured)
return false, err
}
if !tsoperator.ProxyGroupIsReady(pg) {
l.Infof("ProxyGroup %s is not ready, waiting...", proxyGroupName)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcValid, metav1.ConditionUnknown, reasonProxyGroupNotReady, reasonProxyGroupNotReady, esr.clock, l)
tsoperator.RemoveServiceCondition(svc, tsapi.EgressSvcConfigured)
return false, nil
}
if violations := validateEgressService(svc, pg); len(violations) > 0 {
msg := fmt.Sprintf("invalid egress Service: %s", strings.Join(violations, ", "))
esr.recorder.Event(svc, corev1.EventTypeWarning, "INVALIDSERVICE", msg)
l.Info(msg)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcValid, metav1.ConditionFalse, reasonEgressSvcInvalid, msg, esr.clock, l)
tsoperator.RemoveServiceCondition(svc, tsapi.EgressSvcConfigured)
return false, nil
}
l.Debugf("egress service is valid")
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcValid, metav1.ConditionTrue, reasonEgressSvcValid, reasonEgressSvcValid, esr.clock, l)
return true, nil
}
func validateEgressService(svc *corev1.Service, pg *tsapi.ProxyGroup) []string {
violations := validateService(svc)
// We check that only one of these two is set in the earlier validateService function.
if svc.Annotations[AnnotationTailnetTargetFQDN] == "" && svc.Annotations[AnnotationTailnetTargetIP] == "" {
violations = append(violations, fmt.Sprintf("egress Service for ProxyGroup must have one of %s, %s annotations set", AnnotationTailnetTargetFQDN, AnnotationTailnetTargetIP))
}
if len(svc.Spec.Ports) == 0 {
violations = append(violations, "egress Service for ProxyGroup must have at least one target Port specified")
}
if svc.Spec.Type != corev1.ServiceTypeExternalName {
violations = append(violations, fmt.Sprintf("unexpected egress Service type %s. The only supported type is ExternalName.", svc.Spec.Type))
}
if pg.Spec.Type != tsapi.ProxyGroupTypeEgress {
violations = append(violations, fmt.Sprintf("egress Service references ProxyGroup of type %s, must be type %s", pg.Spec.Type, tsapi.ProxyGroupTypeEgress))
}
return violations
}
// egressSvcNameBase returns a name base that can be passed to
// ObjectMeta.GenerateName to generate a name for the ClusterIP Service.
// The generated name needs to be short enough so that it can later be used to
// generate a valid Kubernetes resource name for the EndpointSlice in form
// 'ipv4-|ipv6-<ClusterIP Service name>.
// A valid Kubernetes resource name must not be longer than 253 chars.
func svcNameBase(s string) string {
// -ipv4 - ipv6
const maxClusterIPSvcNameLength = 253 - 5
base := fmt.Sprintf("ts-%s-", s)
generator := names.SimpleNameGenerator
for {
generatedName := generator.GenerateName(base)
excess := len(generatedName) - maxClusterIPSvcNameLength
if excess <= 0 {
return base
}
base = base[:len(base)-1-excess] // cut off the excess chars
base = base + "-" // re-instate the dash
}
}
// unusedPort returns a port in range [3000 - 4000). The caller must ensure that
// usedPorts does not contain all ports in range [3000 - 4000).
func unusedPort(usedPorts sets.Set[int32]) int32 {
foundFreePort := false
var suggestPort int32
for !foundFreePort {
suggestPort = rand.Int32N(maxPorts) + 3000
if !usedPorts.Has(suggestPort) {
foundFreePort = true
}
}
return suggestPort
}
// tailnetTargetFromSvc returns a tailnet target for the given egress Service.
// Service must contain exactly one of tailscale.com/tailnet-ip,
// tailscale.com/tailnet-fqdn annotations.
func tailnetTargetFromSvc(svc *corev1.Service) egressservices.TailnetTarget {
if fqdn := svc.Annotations[AnnotationTailnetTargetFQDN]; fqdn != "" {
return egressservices.TailnetTarget{
FQDN: fqdn,
}
}
return egressservices.TailnetTarget{
IP: svc.Annotations[AnnotationTailnetTargetIP],
}
}
func egressSvcCfg(externalNameSvc, clusterIPSvc *corev1.Service) egressservices.Config {
tt := tailnetTargetFromSvc(externalNameSvc)
cfg := egressservices.Config{TailnetTarget: tt}
for _, svcPort := range clusterIPSvc.Spec.Ports {
pm := portMap(svcPort)
mak.Set(&cfg.Ports, pm, struct{}{})
}
return cfg
}
func portMap(p corev1.ServicePort) egressservices.PortMap {
// TODO (irbekrm): out of bounds check?
return egressservices.PortMap{Protocol: string(p.Protocol), MatchPort: uint16(p.TargetPort.IntVal), TargetPort: uint16(p.Port)}
}
func isEgressSvcForProxyGroup(obj client.Object) bool {
s, ok := obj.(*corev1.Service)
if !ok {
return false
}
annots := s.ObjectMeta.Annotations
return annots[AnnotationProxyGroup] != "" && (annots[AnnotationTailnetTargetFQDN] != "" || annots[AnnotationTailnetTargetIP] != "")
}
// egressSvcConfig returns a ConfigMap that contains egress services configuration for the provided ProxyGroup as well
// as unmarshalled configuration from the ConfigMap.
func egressSvcsConfigs(ctx context.Context, cl client.Client, proxyGroupName, tsNamespace string) (cm *corev1.ConfigMap, cfgs *egressservices.Configs, err error) {
name := pgEgressCMName(proxyGroupName)
cm = &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Namespace: tsNamespace,
},
}
if err := cl.Get(ctx, client.ObjectKeyFromObject(cm), cm); err != nil {
return nil, nil, fmt.Errorf("error retrieving egress services ConfigMap %s: %v", name, err)
}
cfgs = &egressservices.Configs{}
if len(cm.BinaryData[egressservices.KeyEgressServices]) != 0 {
if err := json.Unmarshal(cm.BinaryData[egressservices.KeyEgressServices], cfgs); err != nil {
return nil, nil, fmt.Errorf("error unmarshaling egress services config %v: %w", cm.BinaryData[egressservices.KeyEgressServices], err)
}
}
return cm, cfgs, nil
}
// egressSvcChildResourceLabels returns labels that should be applied to the
// ClusterIP Service and the EndpointSlice created for the egress service.
// TODO(irbekrm): we currently set a bunch of labels based on Kubernetes
// resource names (ProxyGroup, Service). Maximum allowed label length is 63
// chars whilst the maximum allowed resource name length is 253 chars, so we
// should probably validate and truncate (?) the names is they are too long.
func egressSvcChildResourceLabels(svc *corev1.Service) map[string]string {
return map[string]string{
LabelManaged: "true",
LabelParentType: "svc",
LabelParentName: svc.Name,
LabelParentNamespace: svc.Namespace,
labelProxyGroup: svc.Annotations[AnnotationProxyGroup],
labelSvcType: typeEgress,
}
}
// egressEpsLabels returns labels to be added to an EndpointSlice created for an egress service.
func egressSvcEpsLabels(extNSvc, clusterIPSvc *corev1.Service) map[string]string {
l := egressSvcChildResourceLabels(extNSvc)
// Adding this label is what makes kube proxy set up rules to route traffic sent to the clusterIP Service to the
// endpoints defined on this EndpointSlice.
// https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#ownership
l[discoveryv1.LabelServiceName] = clusterIPSvc.Name
// Kubernetes recommends setting this label.
// https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#management
l[discoveryv1.LabelManagedBy] = "tailscale.com"
return l
}
func svcConfigurationUpToDate(svc *corev1.Service, l *zap.SugaredLogger) bool {
cond := tsoperator.GetServiceCondition(svc, tsapi.EgressSvcConfigured)
if cond == nil {
return false
}
if cond.Status != metav1.ConditionTrue {
return false
}
wantsReadyReason := svcConfiguredReason(svc, true, l)
return strings.EqualFold(wantsReadyReason, cond.Reason)
}
func cfgHash(c cfg, l *zap.SugaredLogger) string {
bs, err := json.Marshal(c)
if err != nil {
// Don't use l.Error as that messes up component logs with, in this case, unnecessary stack trace.
l.Infof("error marhsalling Config: %v", err)
return ""
}
h := sha256.New()
if _, err := h.Write(bs); err != nil {
// Don't use l.Error as that messes up component logs with, in this case, unnecessary stack trace.
l.Infof("error producing Config hash: %v", err)
return ""
}
return fmt.Sprintf("%x", h.Sum(nil))
}
type cfg struct {
Ports []corev1.ServicePort `json:"ports"`
TailnetTarget egressservices.TailnetTarget `json:"tailnetTarget"`
ProxyGroup string `json:"proxyGroup"`
}
func svcConfiguredReason(svc *corev1.Service, configured bool, l *zap.SugaredLogger) string {
var r string
if configured {
r = "ConfiguredFor:"
} else {
r = fmt.Sprintf("ConfigurationFailed:%s", r)
}
r += fmt.Sprintf("ProxyGroup:%s", svc.Annotations[AnnotationProxyGroup])
tt := tailnetTargetFromSvc(svc)
s := cfg{
Ports: svc.Spec.Ports,
TailnetTarget: tt,
ProxyGroup: svc.Annotations[AnnotationProxyGroup],
}
r += fmt.Sprintf(":Config:%s", cfgHash(s, l))
return r
}
// tailnetSvc accepts and ExternalName Service name and returns a name that will be used to distinguish this tailnet
// service from other tailnet services exposed to cluster workloads.
func tailnetSvcName(extNSvc *corev1.Service) string {
return fmt.Sprintf("%s-%s", extNSvc.Namespace, extNSvc.Name)
}
// epsPortsFromSvc takes the ClusterIP Service created for an egress service and
// returns its Port array in a form that can be used for an EndpointSlice.
func epsPortsFromSvc(svc *corev1.Service) (ep []discoveryv1.EndpointPort) {
for _, p := range svc.Spec.Ports {
ep = append(ep, discoveryv1.EndpointPort{
Protocol: &p.Protocol,
Port: &p.TargetPort.IntVal,
Name: &p.Name,
})
}
return ep
}

View File

@@ -0,0 +1,268 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"context"
"encoding/json"
"fmt"
"testing"
"github.com/AlekSi/pointer"
"github.com/google/go-cmp/cmp"
"go.uber.org/zap"
corev1 "k8s.io/api/core/v1"
discoveryv1 "k8s.io/api/discovery/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/egressservices"
"tailscale.com/tstest"
"tailscale.com/tstime"
)
func TestTailscaleEgressServices(t *testing.T) {
pg := &tsapi.ProxyGroup{
TypeMeta: metav1.TypeMeta{Kind: "ProxyGroup", APIVersion: "tailscale.com/v1alpha1"},
ObjectMeta: metav1.ObjectMeta{
Name: "foo",
UID: types.UID("1234-UID"),
},
Spec: tsapi.ProxyGroupSpec{
Replicas: pointer.To[int32](3),
Type: tsapi.ProxyGroupTypeEgress,
},
}
cm := &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: pgEgressCMName("foo"),
Namespace: "operator-ns",
},
}
fc := fake.NewClientBuilder().
WithScheme(tsapi.GlobalScheme).
WithObjects(pg, cm).
WithStatusSubresource(pg).
Build()
zl, err := zap.NewDevelopment()
if err != nil {
t.Fatal(err)
}
clock := tstest.NewClock(tstest.ClockOpts{})
esr := &egressSvcsReconciler{
Client: fc,
logger: zl.Sugar(),
clock: clock,
tsNamespace: "operator-ns",
}
tailnetTargetFQDN := "foo.bar.ts.net."
svc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
Namespace: "default",
UID: types.UID("1234-UID"),
Annotations: map[string]string{
AnnotationTailnetTargetFQDN: tailnetTargetFQDN,
AnnotationProxyGroup: "foo",
},
},
Spec: corev1.ServiceSpec{
ExternalName: "placeholder",
Type: corev1.ServiceTypeExternalName,
Selector: nil,
Ports: []corev1.ServicePort{
{
Name: "http",
Protocol: "TCP",
Port: 80,
},
{
Name: "https",
Protocol: "TCP",
Port: 443,
},
},
},
}
t.Run("proxy_group_not_ready", func(t *testing.T) {
mustCreate(t, fc, svc)
expectReconciled(t, esr, "default", "test")
// Service should have EgressSvcValid condition set to Unknown.
svc.Status.Conditions = []metav1.Condition{condition(tsapi.EgressSvcValid, metav1.ConditionUnknown, reasonProxyGroupNotReady, reasonProxyGroupNotReady, clock)}
expectEqual(t, fc, svc, nil)
})
t.Run("proxy_group_ready", func(t *testing.T) {
mustUpdateStatus(t, fc, "", "foo", func(pg *tsapi.ProxyGroup) {
pg.Status.Conditions = []metav1.Condition{
condition(tsapi.ProxyGroupReady, metav1.ConditionTrue, "", "", clock),
}
})
// Quirks of the fake client.
mustUpdateStatus(t, fc, "default", "test", func(svc *corev1.Service) {
svc.Status.Conditions = []metav1.Condition{}
})
expectReconciled(t, esr, "default", "test")
// Verify that a ClusterIP Service has been created.
name := findGenNameForEgressSvcResources(t, fc, svc)
expectEqual(t, fc, clusterIPSvc(name, svc), removeTargetPortsFromSvc)
clusterSvc := mustGetClusterIPSvc(t, fc, name)
// Verify that an EndpointSlice has been created.
expectEqual(t, fc, endpointSlice(name, svc, clusterSvc), nil)
// Verify that ConfigMap contains configuration for the new egress service.
mustHaveConfigForSvc(t, fc, svc, clusterSvc, cm)
r := svcConfiguredReason(svc, true, zl.Sugar())
// Verify that the user-created ExternalName Service has Configured set to true and ExternalName pointing to the
// CluterIP Service.
svc.Status.Conditions = []metav1.Condition{
condition(tsapi.EgressSvcConfigured, metav1.ConditionTrue, r, r, clock),
}
svc.ObjectMeta.Finalizers = []string{"tailscale.com/finalizer"}
svc.Spec.ExternalName = fmt.Sprintf("%s.operator-ns.svc.cluster.local", name)
expectEqual(t, fc, svc, nil)
})
t.Run("delete_external_name_service", func(t *testing.T) {
name := findGenNameForEgressSvcResources(t, fc, svc)
if err := fc.Delete(context.Background(), svc); err != nil {
t.Fatalf("error deleting ExternalName Service: %v", err)
}
expectReconciled(t, esr, "default", "test")
// Verify that ClusterIP Service and EndpointSlice have been deleted.
expectMissing[corev1.Service](t, fc, "operator-ns", name)
expectMissing[discoveryv1.EndpointSlice](t, fc, "operator-ns", fmt.Sprintf("%s-ipv4", name))
// Verify that service config has been deleted from the ConfigMap.
mustNotHaveConfigForSvc(t, fc, svc, cm)
})
}
func condition(typ tsapi.ConditionType, st metav1.ConditionStatus, r, msg string, clock tstime.Clock) metav1.Condition {
return metav1.Condition{
Type: string(typ),
Status: st,
LastTransitionTime: conditionTime(clock),
Reason: r,
Message: msg,
}
}
func findGenNameForEgressSvcResources(t *testing.T, client client.Client, svc *corev1.Service) string {
t.Helper()
labels := egressSvcChildResourceLabels(svc)
s, err := getSingleObject[corev1.Service](context.Background(), client, "operator-ns", labels)
if err != nil {
t.Fatalf("finding ClusterIP Service for ExternalName Service %s: %v", svc.Name, err)
}
if s == nil {
t.Fatalf("no ClusterIP Service found for ExternalName Service %q", svc.Name)
}
return s.GetName()
}
func clusterIPSvc(name string, extNSvc *corev1.Service) *corev1.Service {
labels := egressSvcChildResourceLabels(extNSvc)
return &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Namespace: "operator-ns",
GenerateName: fmt.Sprintf("ts-%s-", extNSvc.Name),
Labels: labels,
},
Spec: corev1.ServiceSpec{
Type: corev1.ServiceTypeClusterIP,
Ports: extNSvc.Spec.Ports,
},
}
}
func mustGetClusterIPSvc(t *testing.T, cl client.Client, name string) *corev1.Service {
svc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Namespace: "operator-ns",
},
}
if err := cl.Get(context.Background(), client.ObjectKeyFromObject(svc), svc); err != nil {
t.Fatalf("error retrieving Service")
}
return svc
}
func endpointSlice(name string, extNSvc, clusterIPSvc *corev1.Service) *discoveryv1.EndpointSlice {
labels := egressSvcChildResourceLabels(extNSvc)
labels[discoveryv1.LabelManagedBy] = "tailscale.com"
labels[discoveryv1.LabelServiceName] = name
return &discoveryv1.EndpointSlice{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-ipv4", name),
Namespace: "operator-ns",
Labels: labels,
},
Ports: portsForEndpointSlice(clusterIPSvc),
AddressType: discoveryv1.AddressTypeIPv4,
}
}
func portsForEndpointSlice(svc *corev1.Service) []discoveryv1.EndpointPort {
ports := make([]discoveryv1.EndpointPort, 0)
for _, p := range svc.Spec.Ports {
ports = append(ports, discoveryv1.EndpointPort{
Name: &p.Name,
Protocol: &p.Protocol,
Port: pointer.ToInt32(p.TargetPort.IntVal),
})
}
return ports
}
func mustHaveConfigForSvc(t *testing.T, cl client.Client, extNSvc, clusterIPSvc *corev1.Service, cm *corev1.ConfigMap) {
t.Helper()
wantsCfg := egressSvcCfg(extNSvc, clusterIPSvc)
if err := cl.Get(context.Background(), client.ObjectKeyFromObject(cm), cm); err != nil {
t.Fatalf("Error retrieving ConfigMap: %v", err)
}
name := tailnetSvcName(extNSvc)
gotCfg := configFromCM(t, cm, name)
if gotCfg == nil {
t.Fatalf("No config found for service %q", name)
}
if diff := cmp.Diff(*gotCfg, wantsCfg); diff != "" {
t.Fatalf("unexpected config for service %q (-got +want):\n%s", name, diff)
}
}
func mustNotHaveConfigForSvc(t *testing.T, cl client.Client, extNSvc *corev1.Service, cm *corev1.ConfigMap) {
t.Helper()
if err := cl.Get(context.Background(), client.ObjectKeyFromObject(cm), cm); err != nil {
t.Fatalf("Error retrieving ConfigMap: %v", err)
}
name := tailnetSvcName(extNSvc)
gotCfg := configFromCM(t, cm, name)
if gotCfg != nil {
t.Fatalf("Config %#+v for service %q found when it should not be present", gotCfg, name)
}
}
func configFromCM(t *testing.T, cm *corev1.ConfigMap, svcName string) *egressservices.Config {
t.Helper()
cfgBs, ok := cm.BinaryData[egressservices.KeyEgressServices]
if !ok {
return nil
}
cfgs := &egressservices.Configs{}
if err := json.Unmarshal(cfgBs, cfgs); err != nil {
t.Fatalf("error unmarshalling config: %v", err)
}
cfg, ok := (*cfgs)[svcName]
if ok {
return &cfg
}
return nil
}

View File

@@ -3,6 +3,7 @@
//go:build !plan9
// The generate command creates tailscale.com CRDs.
package main
import (
@@ -23,10 +24,14 @@ const (
connectorCRDPath = operatorDeploymentFilesPath + "/crds/tailscale.com_connectors.yaml"
proxyClassCRDPath = operatorDeploymentFilesPath + "/crds/tailscale.com_proxyclasses.yaml"
dnsConfigCRDPath = operatorDeploymentFilesPath + "/crds/tailscale.com_dnsconfigs.yaml"
recorderCRDPath = operatorDeploymentFilesPath + "/crds/tailscale.com_recorders.yaml"
proxyGroupCRDPath = operatorDeploymentFilesPath + "/crds/tailscale.com_proxygroups.yaml"
helmTemplatesPath = operatorDeploymentFilesPath + "/chart/templates"
connectorCRDHelmTemplatePath = helmTemplatesPath + "/connector.yaml"
proxyClassCRDHelmTemplatePath = helmTemplatesPath + "/proxyclass.yaml"
dnsConfigCRDHelmTemplatePath = helmTemplatesPath + "/dnsconfig.yaml"
recorderCRDHelmTemplatePath = helmTemplatesPath + "/recorder.yaml"
proxyGroupCRDHelmTemplatePath = helmTemplatesPath + "/proxygroup.yaml"
helmConditionalStart = "{{ if .Values.installCRDs -}}\n"
helmConditionalEnd = "{{- end -}}"
@@ -110,7 +115,7 @@ func main() {
}
}
// generate places tailscale.com CRDs (currently Connector, ProxyClass and DNSConfig) into
// generate places tailscale.com CRDs (currently Connector, ProxyClass, DNSConfig, Recorder) into
// the Helm chart templates behind .Values.installCRDs=true condition (true by
// default).
func generate(baseDir string) error {
@@ -136,28 +141,34 @@ func generate(baseDir string) error {
}
return nil
}
if err := addCRDToHelm(connectorCRDPath, connectorCRDHelmTemplatePath); err != nil {
return fmt.Errorf("error adding Connector CRD to Helm templates: %w", err)
}
if err := addCRDToHelm(proxyClassCRDPath, proxyClassCRDHelmTemplatePath); err != nil {
return fmt.Errorf("error adding ProxyClass CRD to Helm templates: %w", err)
}
if err := addCRDToHelm(dnsConfigCRDPath, dnsConfigCRDHelmTemplatePath); err != nil {
return fmt.Errorf("error adding DNSConfig CRD to Helm templates: %w", err)
for _, crd := range []struct {
crdPath, templatePath string
}{
{connectorCRDPath, connectorCRDHelmTemplatePath},
{proxyClassCRDPath, proxyClassCRDHelmTemplatePath},
{dnsConfigCRDPath, dnsConfigCRDHelmTemplatePath},
{recorderCRDPath, recorderCRDHelmTemplatePath},
{proxyGroupCRDPath, proxyGroupCRDHelmTemplatePath},
} {
if err := addCRDToHelm(crd.crdPath, crd.templatePath); err != nil {
return fmt.Errorf("error adding %s CRD to Helm templates: %w", crd.crdPath, err)
}
}
return nil
}
func cleanup(baseDir string) error {
log.Print("Cleaning up CRD from Helm templates")
if err := os.Remove(filepath.Join(baseDir, connectorCRDHelmTemplatePath)); err != nil && !os.IsNotExist(err) {
return fmt.Errorf("error cleaning up Connector CRD template: %w", err)
}
if err := os.Remove(filepath.Join(baseDir, proxyClassCRDHelmTemplatePath)); err != nil && !os.IsNotExist(err) {
return fmt.Errorf("error cleaning up ProxyClass CRD template: %w", err)
}
if err := os.Remove(filepath.Join(baseDir, dnsConfigCRDHelmTemplatePath)); err != nil && !os.IsNotExist(err) {
return fmt.Errorf("error cleaning up DNSConfig CRD template: %w", err)
for _, path := range []string{
connectorCRDHelmTemplatePath,
proxyClassCRDHelmTemplatePath,
dnsConfigCRDHelmTemplatePath,
recorderCRDHelmTemplatePath,
proxyGroupCRDHelmTemplatePath,
} {
if err := os.Remove(filepath.Join(baseDir, path)); err != nil && !os.IsNotExist(err) {
return fmt.Errorf("error cleaning up %s: %w", path, err)
}
}
return nil
}

View File

@@ -59,6 +59,12 @@ func Test_generate(t *testing.T) {
if !strings.Contains(installContentsWithCRD.String(), "name: dnsconfigs.tailscale.com") {
t.Errorf("DNSConfig CRD not found in default chart install")
}
if !strings.Contains(installContentsWithCRD.String(), "name: recorders.tailscale.com") {
t.Errorf("Recorder CRD not found in default chart install")
}
if !strings.Contains(installContentsWithCRD.String(), "name: proxygroups.tailscale.com") {
t.Errorf("ProxyGroup CRD not found in default chart install")
}
// Test that CRDs can be excluded from Helm chart install
installContentsWithoutCRD := bytes.NewBuffer([]byte{})
@@ -77,4 +83,10 @@ func Test_generate(t *testing.T) {
if strings.Contains(installContentsWithoutCRD.String(), "name: dnsconfigs.tailscale.com") {
t.Errorf("DNSConfig CRD found in chart install that should not contain a CRD")
}
if strings.Contains(installContentsWithoutCRD.String(), "name: recorders.tailscale.com") {
t.Errorf("Recorder CRD found in chart install that should not contain a CRD")
}
if strings.Contains(installContentsWithoutCRD.String(), "name: proxygroups.tailscale.com") {
t.Errorf("ProxyGroup CRD found in chart install that should not contain a CRD")
}
}

View File

@@ -23,6 +23,7 @@ import (
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
"tailscale.com/ipn"
"tailscale.com/kube/kubetypes"
"tailscale.com/types/opt"
"tailscale.com/util/clientmetric"
"tailscale.com/util/set"
@@ -46,12 +47,14 @@ type IngressReconciler struct {
// managedIngresses is a set of all ingress resources that we're currently
// managing. This is only used for metrics.
managedIngresses set.Slice[types.UID]
defaultProxyClass string
}
var (
// gaugeIngressResources tracks the number of ingress resources that we're
// currently managing.
gaugeIngressResources = clientmetric.NewGauge("k8s_ingress_resources")
gaugeIngressResources = clientmetric.NewGauge(kubetypes.MetricIngressResourceCount)
)
func (a *IngressReconciler) Reconcile(ctx context.Context, req reconcile.Request) (_ reconcile.Result, err error) {
@@ -133,7 +136,7 @@ func (a *IngressReconciler) maybeProvision(ctx context.Context, logger *zap.Suga
}
}
proxyClass := proxyClassForObject(ing)
proxyClass := proxyClassForObject(ing, a.defaultProxyClass)
if proxyClass != "" {
if ready, err := proxyClassIsReady(ctx, proxyClass, a.Client); err != nil {
return fmt.Errorf("error verifying ProxyClass for Ingress: %w", err)

View File

@@ -17,6 +17,7 @@ import (
"sigs.k8s.io/controller-runtime/pkg/client/fake"
"tailscale.com/ipn"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/kubetypes"
"tailscale.com/types/ptr"
"tailscale.com/util/mak"
)
@@ -93,6 +94,7 @@ func TestTailscaleIngress(t *testing.T) {
namespace: "default",
parentType: "ingress",
hostname: "default-test",
app: kubetypes.AppIngressResource,
}
serveConfig := &ipn.ServeConfig{
TCP: map[uint16]*ipn.TCPPortHandler{443: {HTTPS: true}},
@@ -224,6 +226,7 @@ func TestTailscaleIngressWithProxyClass(t *testing.T) {
namespace: "default",
parentType: "ingress",
hostname: "default-test",
app: kubetypes.AppIngressResource,
}
serveConfig := &ipn.ServeConfig{
TCP: map[uint16]*ipn.TCPPortHandler{443: {HTTPS: true}},
@@ -250,7 +253,7 @@ func TestTailscaleIngressWithProxyClass(t *testing.T) {
pc.Status = tsapi.ProxyClassStatus{
Conditions: []metav1.Condition{{
Status: metav1.ConditionTrue,
Type: string(tsapi.ProxyClassready),
Type: string(tsapi.ProxyClassReady),
ObservedGeneration: pc.Generation,
}}}
})

View File

@@ -28,6 +28,7 @@ import (
"sigs.k8s.io/yaml"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/kubetypes"
"tailscale.com/tstime"
"tailscale.com/util/clientmetric"
"tailscale.com/util/set"
@@ -62,9 +63,7 @@ type NameserverReconciler struct {
managedNameservers set.Slice[types.UID] // one or none
}
var (
gaugeNameserverResources = clientmetric.NewGauge("k8s_nameserver_resources")
)
var gaugeNameserverResources = clientmetric.NewGauge(kubetypes.MetricNameserverCount)
func (a *NameserverReconciler) Reconcile(ctx context.Context, req reconcile.Request) (res reconcile.Result, err error) {
logger := a.logger.With("dnsConfig", req.Name)

View File

@@ -33,7 +33,7 @@ func TestNameserverReconciler(t *testing.T) {
},
Spec: tsapi.DNSConfigSpec{
Nameserver: &tsapi.Nameserver{
Image: &tsapi.Image{
Image: &tsapi.NameserverImage{
Repo: "test",
Tag: "v0.0.1",
},

View File

@@ -11,6 +11,7 @@ import (
"context"
"os"
"regexp"
"strconv"
"strings"
"time"
@@ -22,6 +23,7 @@ import (
corev1 "k8s.io/api/core/v1"
discoveryv1 "k8s.io/api/discovery/v1"
networkingv1 "k8s.io/api/networking/v1"
rbacv1 "k8s.io/api/rbac/v1"
"k8s.io/apimachinery/pkg/types"
"k8s.io/client-go/rest"
"sigs.k8s.io/controller-runtime/pkg/builder"
@@ -39,6 +41,7 @@ import (
"tailscale.com/ipn"
"tailscale.com/ipn/store/kubestore"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/kubetypes"
"tailscale.com/tsnet"
"tailscale.com/tstime"
"tailscale.com/types/logger"
@@ -51,8 +54,8 @@ import (
// Generate static manifests for deploying Tailscale operator on Kubernetes from the operator's Helm chart.
//go:generate go run tailscale.com/cmd/k8s-operator/generate staticmanifests
// Generate CRD docs from the yamls
//go:generate go run fybrik.io/crdoc --resources=./deploy/crds --output=../../k8s-operator/api.md
// Generate CRD API docs.
//go:generate go run github.com/elastic/crd-ref-docs --renderer=markdown --source-path=../../k8s-operator/apis/ --config=../../k8s-operator/api-docs-config.yaml --output-path=../../k8s-operator/api.md
func main() {
// Required to use our client API. We're fine with the instability since the
@@ -66,6 +69,7 @@ func main() {
priorityClassName = defaultEnv("PROXY_PRIORITY_CLASS_NAME", "")
tags = defaultEnv("PROXY_TAGS", "tag:k8s")
tsFirewallMode = defaultEnv("PROXY_FIREWALL_MODE", "")
defaultProxyClass = defaultEnv("PROXY_DEFAULT_CLASS", "")
isDefaultLoadBalancer = defaultBool("OPERATOR_DEFAULT_LOAD_BALANCER", false)
)
@@ -86,9 +90,9 @@ func main() {
// https://tailscale.com/kb/1236/kubernetes-operator/?q=kubernetes#accessing-the-kubernetes-control-plane-using-an-api-server-proxy.
mode := parseAPIProxyMode()
if mode == apiserverProxyModeDisabled {
hostinfo.SetApp("k8s-operator")
hostinfo.SetApp(kubetypes.AppOperator)
} else {
hostinfo.SetApp("k8s-operator-proxy")
hostinfo.SetApp(kubetypes.AppAPIServerProxy)
}
s, tsClient := initTSNet(zlog)
@@ -106,6 +110,7 @@ func main() {
proxyActAsDefaultLoadBalancer: isDefaultLoadBalancer,
proxyTags: tags,
proxyFirewallMode: tsFirewallMode,
defaultProxyClass: defaultProxyClass,
}
runReconcilers(rOpts)
}
@@ -139,12 +144,20 @@ func initTSNet(zlog *zap.SugaredLogger) (*tsnet.Server, *tailscale.Client) {
TokenURL: "https://login.tailscale.com/api/v2/oauth/token",
}
tsClient := tailscale.NewClient("-", nil)
tsClient.UserAgent = "tailscale-k8s-operator"
tsClient.HTTPClient = credentials.Client(context.Background())
s := &tsnet.Server{
Hostname: hostname,
Logf: zlog.Named("tailscaled").Debugf,
}
if p := os.Getenv("TS_PORT"); p != "" {
port, err := strconv.ParseUint(p, 10, 16)
if err != nil {
startlog.Fatalf("TS_PORT %q cannot be parsed as uint16: %v", p, err)
}
s.Port = uint16(port)
}
if kubeSecret != "" {
st, err := kubestore.New(logger.Discard, kubeSecret)
if err != nil {
@@ -234,10 +247,13 @@ func runReconcilers(opts reconcilerOpts) {
ByObject: map[client.Object]cache.ByObject{
&corev1.Secret{}: nsFilter,
&corev1.ServiceAccount{}: nsFilter,
&corev1.Pod{}: nsFilter,
&corev1.ConfigMap{}: nsFilter,
&appsv1.StatefulSet{}: nsFilter,
&appsv1.Deployment{}: nsFilter,
&discoveryv1.EndpointSlice{}: nsFilter,
&rbacv1.Role{}: nsFilter,
&rbacv1.RoleBinding{}: nsFilter,
},
},
Scheme: tsapi.GlobalScheme,
@@ -279,6 +295,7 @@ func runReconcilers(opts reconcilerOpts) {
recorder: eventRecorder,
tsNamespace: opts.tailscaleNamespace,
clock: tstime.DefaultClock{},
defaultProxyClass: opts.defaultProxyClass,
})
if err != nil {
startlog.Fatalf("could not create service reconciler: %v", err)
@@ -297,10 +314,11 @@ func runReconcilers(opts reconcilerOpts) {
Watches(&corev1.Service{}, svcHandlerForIngress).
Watches(&tsapi.ProxyClass{}, proxyClassFilterForIngress).
Complete(&IngressReconciler{
ssr: ssr,
recorder: eventRecorder,
Client: mgr.GetClient(),
logger: opts.log.Named("ingress-reconciler"),
ssr: ssr,
recorder: eventRecorder,
Client: mgr.GetClient(),
logger: opts.log.Named("ingress-reconciler"),
defaultProxyClass: opts.defaultProxyClass,
})
if err != nil {
startlog.Fatalf("could not create ingress reconciler: %v", err)
@@ -345,6 +363,65 @@ func runReconcilers(opts reconcilerOpts) {
if err != nil {
startlog.Fatalf("could not create nameserver reconciler: %v", err)
}
egressSvcFilter := handler.EnqueueRequestsFromMapFunc(egressSvcsHandler)
egressProxyGroupFilter := handler.EnqueueRequestsFromMapFunc(egressSvcsFromEgressProxyGroup(mgr.GetClient(), opts.log))
err = builder.
ControllerManagedBy(mgr).
Named("egress-svcs-reconciler").
Watches(&corev1.Service{}, egressSvcFilter).
Watches(&tsapi.ProxyGroup{}, egressProxyGroupFilter).
Complete(&egressSvcsReconciler{
Client: mgr.GetClient(),
tsNamespace: opts.tailscaleNamespace,
recorder: eventRecorder,
clock: tstime.DefaultClock{},
logger: opts.log.Named("egress-svcs-reconciler"),
})
if err != nil {
startlog.Fatalf("could not create egress Services reconciler: %v", err)
}
if err := mgr.GetFieldIndexer().IndexField(context.Background(), new(corev1.Service), indexEgressProxyGroup, indexEgressServices); err != nil {
startlog.Fatalf("failed setting up indexer for egress Services: %v", err)
}
egressSvcFromEpsFilter := handler.EnqueueRequestsFromMapFunc(egressSvcFromEps)
err = builder.
ControllerManagedBy(mgr).
Named("egress-svcs-readiness-reconciler").
Watches(&corev1.Service{}, egressSvcFilter).
Watches(&discoveryv1.EndpointSlice{}, egressSvcFromEpsFilter).
Complete(&egressSvcsReadinessReconciler{
Client: mgr.GetClient(),
tsNamespace: opts.tailscaleNamespace,
clock: tstime.DefaultClock{},
logger: opts.log.Named("egress-svcs-readiness-reconciler"),
})
if err != nil {
startlog.Fatalf("could not create egress Services readiness reconciler: %v", err)
}
epsFilter := handler.EnqueueRequestsFromMapFunc(egressEpsHandler)
podsFilter := handler.EnqueueRequestsFromMapFunc(egressEpsFromPGPods(mgr.GetClient(), opts.tailscaleNamespace))
secretsFilter := handler.EnqueueRequestsFromMapFunc(egressEpsFromPGStateSecrets(mgr.GetClient(), opts.tailscaleNamespace))
epsFromExtNSvcFilter := handler.EnqueueRequestsFromMapFunc(epsFromExternalNameService(mgr.GetClient(), opts.log, opts.tailscaleNamespace))
err = builder.
ControllerManagedBy(mgr).
Named("egress-eps-reconciler").
Watches(&discoveryv1.EndpointSlice{}, epsFilter).
Watches(&corev1.Pod{}, podsFilter).
Watches(&corev1.Secret{}, secretsFilter).
Watches(&corev1.Service{}, epsFromExtNSvcFilter).
Complete(&egressEpsReconciler{
Client: mgr.GetClient(),
tsNamespace: opts.tailscaleNamespace,
logger: opts.log.Named("egress-eps-reconciler"),
})
if err != nil {
startlog.Fatalf("could not create egress EndpointSlices reconciler: %v", err)
}
err = builder.ControllerManagedBy(mgr).
For(&tsapi.ProxyClass{}).
Complete(&ProxyClassReconciler{
@@ -384,6 +461,56 @@ func runReconcilers(opts reconcilerOpts) {
if err != nil {
startlog.Fatalf("could not create DNS records reconciler: %v", err)
}
// Recorder reconciler.
recorderFilter := handler.EnqueueRequestForOwner(mgr.GetScheme(), mgr.GetRESTMapper(), &tsapi.Recorder{})
err = builder.ControllerManagedBy(mgr).
For(&tsapi.Recorder{}).
Watches(&appsv1.StatefulSet{}, recorderFilter).
Watches(&corev1.ServiceAccount{}, recorderFilter).
Watches(&corev1.Secret{}, recorderFilter).
Watches(&rbacv1.Role{}, recorderFilter).
Watches(&rbacv1.RoleBinding{}, recorderFilter).
Complete(&RecorderReconciler{
recorder: eventRecorder,
tsNamespace: opts.tailscaleNamespace,
Client: mgr.GetClient(),
l: opts.log.Named("recorder-reconciler"),
clock: tstime.DefaultClock{},
tsClient: opts.tsClient,
})
if err != nil {
startlog.Fatalf("could not create Recorder reconciler: %v", err)
}
// Recorder reconciler.
ownedByProxyGroupFilter := handler.EnqueueRequestForOwner(mgr.GetScheme(), mgr.GetRESTMapper(), &tsapi.ProxyGroup{})
proxyClassFilterForProxyGroup := handler.EnqueueRequestsFromMapFunc(proxyClassHandlerForProxyGroup(mgr.GetClient(), startlog))
err = builder.ControllerManagedBy(mgr).
For(&tsapi.ProxyGroup{}).
Watches(&appsv1.StatefulSet{}, ownedByProxyGroupFilter).
Watches(&corev1.ServiceAccount{}, ownedByProxyGroupFilter).
Watches(&corev1.Secret{}, ownedByProxyGroupFilter).
Watches(&rbacv1.Role{}, ownedByProxyGroupFilter).
Watches(&rbacv1.RoleBinding{}, ownedByProxyGroupFilter).
Watches(&tsapi.ProxyClass{}, proxyClassFilterForProxyGroup).
Complete(&ProxyGroupReconciler{
recorder: eventRecorder,
Client: mgr.GetClient(),
l: opts.log.Named("proxygroup-reconciler"),
clock: tstime.DefaultClock{},
tsClient: opts.tsClient,
tsNamespace: opts.tailscaleNamespace,
proxyImage: opts.proxyImage,
defaultTags: strings.Split(opts.proxyTags, ","),
tsFirewallMode: opts.proxyFirewallMode,
defaultProxyClass: opts.defaultProxyClass,
})
if err != nil {
startlog.Fatalf("could not create ProxyGroup reconciler: %v", err)
}
startlog.Infof("Startup complete, operator running, version: %s", version.Long())
if err := mgr.Start(signals.SetupSignalHandler()); err != nil {
startlog.Fatalf("could not start manager: %v", err)
@@ -424,6 +551,10 @@ type reconcilerOpts struct {
// Auto is usually the best choice, unless you want to explicitly set
// specific mode for debugging purposes.
proxyFirewallMode string
// defaultProxyClass is the name of the ProxyClass to use as the default
// class for proxies that do not have a ProxyClass set.
// this is defined by an operator env variable.
defaultProxyClass string
}
// enqueueAllIngressEgressProxySvcsinNS returns a reconcile request for each
@@ -516,6 +647,7 @@ func dnsRecordsReconcilerIngressHandler(ns string, isDefaultLoadBalancer bool, c
type tsClient interface {
CreateKey(ctx context.Context, caps tailscale.KeyCapabilities) (string, *tailscale.Key, error)
Device(ctx context.Context, deviceID string, fields *tailscale.DeviceFieldsOpts) (*tailscale.Device, error)
DeleteDevice(ctx context.Context, nodeStableID string) error
}
@@ -611,6 +743,27 @@ func proxyClassHandlerForConnector(cl client.Client, logger *zap.SugaredLogger)
}
}
// proxyClassHandlerForConnector returns a handler that, for a given ProxyClass,
// returns a list of reconcile requests for all Connectors that have
// .spec.proxyClass set.
func proxyClassHandlerForProxyGroup(cl client.Client, logger *zap.SugaredLogger) handler.MapFunc {
return func(ctx context.Context, o client.Object) []reconcile.Request {
pgList := new(tsapi.ProxyGroupList)
if err := cl.List(ctx, pgList); err != nil {
logger.Debugf("error listing ProxyGroups for ProxyClass: %v", err)
return nil
}
reqs := make([]reconcile.Request, 0)
proxyClassName := o.GetName()
for _, pg := range pgList.Items {
if pg.Spec.ProxyClass == proxyClassName {
reqs = append(reqs, reconcile.Request{NamespacedName: client.ObjectKeyFromObject(&pg)})
}
}
return reqs
}
}
// serviceHandlerForIngress returns a handler for Service events for ingress
// reconciler that ensures that if the Service associated with an event is of
// interest to the reconciler, the associated Ingress(es) gets be reconciled.
@@ -652,6 +805,10 @@ func serviceHandlerForIngress(cl client.Client, logger *zap.SugaredLogger) handl
}
func serviceHandler(_ context.Context, o client.Object) []reconcile.Request {
if _, ok := o.GetAnnotations()[AnnotationProxyGroup]; ok {
// Do not reconcile Services for ProxyGroup.
return nil
}
if isManagedByType(o, "svc") {
// If this is a Service managed by a Service we want to enqueue its parent
return []reconcile.Request{{NamespacedName: parentFromObjectLabels(o)}}
@@ -677,3 +834,195 @@ func isMagicDNSName(name string) bool {
validMagicDNSName := regexp.MustCompile(`^[a-zA-Z0-9-]+\.[a-zA-Z0-9-]+\.ts\.net\.?$`)
return validMagicDNSName.MatchString(name)
}
// egressSvcsHandler returns accepts a Kubernetes object and returns a reconcile
// request for it , if the object is a Tailscale egress Service meant to be
// exposed on a ProxyGroup.
func egressSvcsHandler(_ context.Context, o client.Object) []reconcile.Request {
if !isEgressSvcForProxyGroup(o) {
return nil
}
return []reconcile.Request{
{
NamespacedName: types.NamespacedName{
Namespace: o.GetNamespace(),
Name: o.GetName(),
},
},
}
}
// egressEpsHandler returns accepts an EndpointSlice and, if the EndpointSlice
// is for an egress service, returns a reconcile request for it.
func egressEpsHandler(_ context.Context, o client.Object) []reconcile.Request {
if typ := o.GetLabels()[labelSvcType]; typ != typeEgress {
return nil
}
return []reconcile.Request{
{
NamespacedName: types.NamespacedName{
Namespace: o.GetNamespace(),
Name: o.GetName(),
},
},
}
}
// egressEpsFromEgressPods returns a Pod event handler that checks if Pod is a replica for a ProxyGroup and if it is,
// returns reconciler requests for all egress EndpointSlices for that ProxyGroup.
func egressEpsFromPGPods(cl client.Client, ns string) handler.MapFunc {
return func(_ context.Context, o client.Object) []reconcile.Request {
if v, ok := o.GetLabels()[LabelManaged]; !ok || v != "true" {
return nil
}
// TODO(irbekrm): for now this is good enough as all ProxyGroups are egress. Add a type check once we
// have ingress ProxyGroups.
if typ := o.GetLabels()[LabelParentType]; typ != "proxygroup" {
return nil
}
pg, ok := o.GetLabels()[LabelParentName]
if !ok {
return nil
}
return reconcileRequestsForPG(pg, cl, ns)
}
}
// egressEpsFromPGStateSecrets returns a Secret event handler that checks if Secret is a state Secret for a ProxyGroup and if it is,
// returns reconciler requests for all egress EndpointSlices for that ProxyGroup.
func egressEpsFromPGStateSecrets(cl client.Client, ns string) handler.MapFunc {
return func(_ context.Context, o client.Object) []reconcile.Request {
if v, ok := o.GetLabels()[LabelManaged]; !ok || v != "true" {
return nil
}
// TODO(irbekrm): for now this is good enough as all ProxyGroups are egress. Add a type check once we
// have ingress ProxyGroups.
if parentType := o.GetLabels()[LabelParentType]; parentType != "proxygroup" {
return nil
}
if secretType := o.GetLabels()[labelSecretType]; secretType != "state" {
return nil
}
pg, ok := o.GetLabels()[LabelParentName]
if !ok {
return nil
}
return reconcileRequestsForPG(pg, cl, ns)
}
}
// egressSvcFromEps is an event handler for EndpointSlices. If an EndpointSlice is for an egress ExternalName Service
// meant to be exposed on a ProxyGroup, returns a reconcile request for the Service.
func egressSvcFromEps(_ context.Context, o client.Object) []reconcile.Request {
if typ := o.GetLabels()[labelSvcType]; typ != typeEgress {
return nil
}
if v, ok := o.GetLabels()[LabelManaged]; !ok || v != "true" {
return nil
}
svcName, ok := o.GetLabels()[LabelParentName]
if !ok {
return nil
}
svcNs, ok := o.GetLabels()[LabelParentNamespace]
if !ok {
return nil
}
return []reconcile.Request{
{
NamespacedName: types.NamespacedName{
Namespace: svcNs,
Name: svcName,
},
},
}
}
func reconcileRequestsForPG(pg string, cl client.Client, ns string) []reconcile.Request {
epsList := discoveryv1.EndpointSliceList{}
if err := cl.List(context.Background(), &epsList,
client.InNamespace(ns),
client.MatchingLabels(map[string]string{labelProxyGroup: pg})); err != nil {
return nil
}
reqs := make([]reconcile.Request, 0)
for _, ep := range epsList.Items {
reqs = append(reqs, reconcile.Request{
NamespacedName: types.NamespacedName{
Namespace: ep.Namespace,
Name: ep.Name,
},
})
}
return reqs
}
// egressSvcsFromEgressProxyGroup is an event handler for egress ProxyGroups. It returns reconcile requests for all
// user-created ExternalName Services that should be exposed on this ProxyGroup.
func egressSvcsFromEgressProxyGroup(cl client.Client, logger *zap.SugaredLogger) handler.MapFunc {
return func(_ context.Context, o client.Object) []reconcile.Request {
pg, ok := o.(*tsapi.ProxyGroup)
if !ok {
logger.Infof("[unexpected] ProxyGroup handler triggered for an object that is not a ProxyGroup")
return nil
}
if pg.Spec.Type != tsapi.ProxyGroupTypeEgress {
return nil
}
svcList := &corev1.ServiceList{}
if err := cl.List(context.Background(), svcList, client.MatchingFields{indexEgressProxyGroup: pg.Name}); err != nil {
logger.Infof("error listing Services: %v, skipping a reconcile for event on ProxyGroup %s", err, pg.Name)
return nil
}
reqs := make([]reconcile.Request, 0)
for _, svc := range svcList.Items {
reqs = append(reqs, reconcile.Request{
NamespacedName: types.NamespacedName{
Namespace: svc.Namespace,
Name: svc.Name,
},
})
}
return reqs
}
}
// epsFromExternalNameService is an event handler for ExternalName Services that define a Tailscale egress service that
// should be exposed on a ProxyGroup. It returns reconcile requests for EndpointSlices created for this Service.
func epsFromExternalNameService(cl client.Client, logger *zap.SugaredLogger, ns string) handler.MapFunc {
return func(_ context.Context, o client.Object) []reconcile.Request {
svc, ok := o.(*corev1.Service)
if !ok {
logger.Infof("[unexpected] Service handler triggered for an object that is not a Service")
return nil
}
if !isEgressSvcForProxyGroup(svc) {
return nil
}
epsList := &discoveryv1.EndpointSliceList{}
if err := cl.List(context.Background(), epsList, client.InNamespace(ns),
client.MatchingLabels(egressSvcChildResourceLabels(svc))); err != nil {
logger.Infof("error listing EndpointSlices: %v, skipping a reconcile for event on Service %s", err, svc.Name)
return nil
}
reqs := make([]reconcile.Request, 0)
for _, eps := range epsList.Items {
reqs = append(reqs, reconcile.Request{
NamespacedName: types.NamespacedName{
Namespace: eps.Namespace,
Name: eps.Name,
},
})
}
return reqs
}
}
// indexEgressServices adds a local index to a cached Tailscale egress Services meant to be exposed on a ProxyGroup. The
// index is used a list filter.
func indexEgressServices(o client.Object) []string {
if !isEgressSvcForProxyGroup(o) {
return nil
}
return []string{o.GetAnnotations()[AnnotationProxyGroup]}
}

View File

@@ -22,6 +22,7 @@ import (
"sigs.k8s.io/controller-runtime/pkg/client/fake"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/kubetypes"
"tailscale.com/net/dns/resolvconffile"
"tailscale.com/tstest"
"tailscale.com/tstime"
@@ -123,6 +124,7 @@ func TestLoadBalancerClass(t *testing.T) {
parentType: "svc",
hostname: "default-test",
clusterTargetIP: "10.20.30.40",
app: kubetypes.AppIngressProxy,
}
expectEqual(t, fc, expectedSecret(t, fc, opts), nil)
@@ -260,6 +262,7 @@ func TestTailnetTargetFQDNAnnotation(t *testing.T) {
parentType: "svc",
tailnetTargetFQDN: tailnetTargetFQDN,
hostname: "default-test",
app: kubetypes.AppEgressProxy,
}
expectEqual(t, fc, expectedSecret(t, fc, o), nil)
@@ -371,6 +374,7 @@ func TestTailnetTargetIPAnnotation(t *testing.T) {
parentType: "svc",
tailnetTargetIP: tailnetTargetIP,
hostname: "default-test",
app: kubetypes.AppEgressProxy,
}
expectEqual(t, fc, expectedSecret(t, fc, o), nil)
@@ -428,6 +432,148 @@ func TestTailnetTargetIPAnnotation(t *testing.T) {
expectMissing[corev1.Secret](t, fc, "operator-ns", fullName)
}
func TestTailnetTargetIPAnnotation_IPCouldNotBeParsed(t *testing.T) {
fc := fake.NewFakeClient()
ft := &fakeTSClient{}
zl, err := zap.NewDevelopment()
if err != nil {
t.Fatal(err)
}
clock := tstest.NewClock(tstest.ClockOpts{})
sr := &ServiceReconciler{
Client: fc,
ssr: &tailscaleSTSReconciler{
Client: fc,
tsClient: ft,
defaultTags: []string{"tag:k8s"},
operatorNamespace: "operator-ns",
proxyImage: "tailscale/tailscale",
},
logger: zl.Sugar(),
clock: clock,
recorder: record.NewFakeRecorder(100),
}
tailnetTargetIP := "invalid-ip"
mustCreate(t, fc, &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
Namespace: "default",
UID: types.UID("1234-UID"),
Annotations: map[string]string{
AnnotationTailnetTargetIP: tailnetTargetIP,
},
},
Spec: corev1.ServiceSpec{
ClusterIP: "10.20.30.40",
Type: corev1.ServiceTypeLoadBalancer,
LoadBalancerClass: ptr.To("tailscale"),
},
})
expectReconciled(t, sr, "default", "test")
t0 := conditionTime(clock)
want := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
Namespace: "default",
UID: types.UID("1234-UID"),
Annotations: map[string]string{
AnnotationTailnetTargetIP: tailnetTargetIP,
},
},
Spec: corev1.ServiceSpec{
ClusterIP: "10.20.30.40",
Type: corev1.ServiceTypeLoadBalancer,
LoadBalancerClass: ptr.To("tailscale"),
},
Status: corev1.ServiceStatus{
Conditions: []metav1.Condition{{
Type: string(tsapi.ProxyReady),
Status: metav1.ConditionFalse,
LastTransitionTime: t0,
Reason: reasonProxyInvalid,
Message: `unable to provision proxy resources: invalid Service: invalid value of annotation tailscale.com/tailnet-ip: "invalid-ip" could not be parsed as a valid IP Address, error: ParseAddr("invalid-ip"): unable to parse IP`,
}},
},
}
expectEqual(t, fc, want, nil)
}
func TestTailnetTargetIPAnnotation_InvalidIP(t *testing.T) {
fc := fake.NewFakeClient()
ft := &fakeTSClient{}
zl, err := zap.NewDevelopment()
if err != nil {
t.Fatal(err)
}
clock := tstest.NewClock(tstest.ClockOpts{})
sr := &ServiceReconciler{
Client: fc,
ssr: &tailscaleSTSReconciler{
Client: fc,
tsClient: ft,
defaultTags: []string{"tag:k8s"},
operatorNamespace: "operator-ns",
proxyImage: "tailscale/tailscale",
},
logger: zl.Sugar(),
clock: clock,
recorder: record.NewFakeRecorder(100),
}
tailnetTargetIP := "999.999.999.999"
mustCreate(t, fc, &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
Namespace: "default",
UID: types.UID("1234-UID"),
Annotations: map[string]string{
AnnotationTailnetTargetIP: tailnetTargetIP,
},
},
Spec: corev1.ServiceSpec{
ClusterIP: "10.20.30.40",
Type: corev1.ServiceTypeLoadBalancer,
LoadBalancerClass: ptr.To("tailscale"),
},
})
expectReconciled(t, sr, "default", "test")
t0 := conditionTime(clock)
want := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
Namespace: "default",
UID: types.UID("1234-UID"),
Annotations: map[string]string{
AnnotationTailnetTargetIP: tailnetTargetIP,
},
},
Spec: corev1.ServiceSpec{
ClusterIP: "10.20.30.40",
Type: corev1.ServiceTypeLoadBalancer,
LoadBalancerClass: ptr.To("tailscale"),
},
Status: corev1.ServiceStatus{
Conditions: []metav1.Condition{{
Type: string(tsapi.ProxyReady),
Status: metav1.ConditionFalse,
LastTransitionTime: t0,
Reason: reasonProxyInvalid,
Message: `unable to provision proxy resources: invalid Service: invalid value of annotation tailscale.com/tailnet-ip: "999.999.999.999" could not be parsed as a valid IP Address, error: ParseAddr("999.999.999.999"): IPv4 field has value >255`,
}},
},
}
expectEqual(t, fc, want, nil)
}
func TestAnnotations(t *testing.T) {
fc := fake.NewFakeClient()
ft := &fakeTSClient{}
@@ -479,6 +625,7 @@ func TestAnnotations(t *testing.T) {
parentType: "svc",
hostname: "default-test",
clusterTargetIP: "10.20.30.40",
app: kubetypes.AppIngressProxy,
}
expectEqual(t, fc, expectedSecret(t, fc, o), nil)
@@ -584,6 +731,7 @@ func TestAnnotationIntoLB(t *testing.T) {
parentType: "svc",
hostname: "default-test",
clusterTargetIP: "10.20.30.40",
app: kubetypes.AppIngressProxy,
}
expectEqual(t, fc, expectedSecret(t, fc, o), nil)
@@ -713,6 +861,7 @@ func TestLBIntoAnnotation(t *testing.T) {
parentType: "svc",
hostname: "default-test",
clusterTargetIP: "10.20.30.40",
app: kubetypes.AppIngressProxy,
}
expectEqual(t, fc, expectedSecret(t, fc, o), nil)
@@ -852,6 +1001,7 @@ func TestCustomHostname(t *testing.T) {
parentType: "svc",
hostname: "reindeer-flotilla",
clusterTargetIP: "10.20.30.40",
app: kubetypes.AppIngressProxy,
}
expectEqual(t, fc, expectedSecret(t, fc, o), nil)
@@ -964,6 +1114,7 @@ func TestCustomPriorityClassName(t *testing.T) {
hostname: "tailscale-critical",
priorityClassName: "custom-priority-class-name",
clusterTargetIP: "10.20.30.40",
app: kubetypes.AppIngressProxy,
}
expectEqual(t, fc, expectedSTS(t, fc, o), removeHashAnnotation)
@@ -1032,6 +1183,7 @@ func TestProxyClassForService(t *testing.T) {
parentType: "svc",
hostname: "default-test",
clusterTargetIP: "10.20.30.40",
app: kubetypes.AppIngressProxy,
}
expectEqual(t, fc, expectedSecret(t, fc, opts), nil)
expectEqual(t, fc, expectedHeadlessService(shortName, "svc"), nil)
@@ -1054,7 +1206,7 @@ func TestProxyClassForService(t *testing.T) {
pc.Status = tsapi.ProxyClassStatus{
Conditions: []metav1.Condition{{
Status: metav1.ConditionTrue,
Type: string(tsapi.ProxyClassready),
Type: string(tsapi.ProxyClassReady),
ObservedGeneration: pc.Generation,
}}}
})
@@ -1125,6 +1277,7 @@ func TestDefaultLoadBalancer(t *testing.T) {
parentType: "svc",
hostname: "default-test",
clusterTargetIP: "10.20.30.40",
app: kubetypes.AppIngressProxy,
}
expectEqual(t, fc, expectedSTS(t, fc, o), removeHashAnnotation)
@@ -1181,6 +1334,7 @@ func TestProxyFirewallMode(t *testing.T) {
hostname: "default-test",
firewallMode: "nftables",
clusterTargetIP: "10.20.30.40",
app: kubetypes.AppIngressProxy,
}
expectEqual(t, fc, expectedSTS(t, fc, o), removeHashAnnotation)
}
@@ -1234,7 +1388,8 @@ func TestTailscaledConfigfileHash(t *testing.T) {
parentType: "svc",
hostname: "default-test",
clusterTargetIP: "10.20.30.40",
confFileHash: "e09bededa0379920141cbd0b0dbdf9b8b66545877f9e8397423f5ce3e1ba439e",
confFileHash: "a67b5ad3ff605531c822327e8f1a23dd0846e1075b722c13402f7d5d0ba32ba2",
app: kubetypes.AppIngressProxy,
}
expectEqual(t, fc, expectedSTS(t, fc, o), nil)
@@ -1244,7 +1399,7 @@ func TestTailscaledConfigfileHash(t *testing.T) {
mak.Set(&svc.Annotations, AnnotationHostname, "another-test")
})
o.hostname = "another-test"
o.confFileHash = "5d754cf55463135ee34aa9821f2fd8483b53eb0570c3740c84a086304f427684"
o.confFileHash = "888a993ebee20ad6be99623b45015339de117946850cf1252bede0b570e04293"
expectReconciled(t, sr, "default", "test")
expectEqual(t, fc, expectedSTS(t, fc, o), nil)
}
@@ -1474,6 +1629,72 @@ func Test_clusterDomainFromResolverConf(t *testing.T) {
})
}
}
func Test_authKeyRemoval(t *testing.T) {
fc := fake.NewFakeClient()
ft := &fakeTSClient{}
zl, err := zap.NewDevelopment()
if err != nil {
t.Fatal(err)
}
// 1. A new Service that should be exposed via Tailscale gets created, a Secret with a config that contains auth
// key is generated.
clock := tstest.NewClock(tstest.ClockOpts{})
sr := &ServiceReconciler{
Client: fc,
ssr: &tailscaleSTSReconciler{
Client: fc,
tsClient: ft,
defaultTags: []string{"tag:k8s"},
operatorNamespace: "operator-ns",
proxyImage: "tailscale/tailscale",
},
logger: zl.Sugar(),
clock: clock,
}
mustCreate(t, fc, &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
Namespace: "default",
UID: types.UID("1234-UID"),
},
Spec: corev1.ServiceSpec{
ClusterIP: "10.20.30.40",
Type: corev1.ServiceTypeLoadBalancer,
LoadBalancerClass: ptr.To("tailscale"),
},
})
expectReconciled(t, sr, "default", "test")
fullName, shortName := findGenName(t, fc, "default", "test", "svc")
opts := configOpts{
stsName: shortName,
secretName: fullName,
namespace: "default",
parentType: "svc",
hostname: "default-test",
clusterTargetIP: "10.20.30.40",
app: kubetypes.AppIngressProxy,
}
expectEqual(t, fc, expectedSecret(t, fc, opts), nil)
expectEqual(t, fc, expectedHeadlessService(shortName, "svc"), nil)
expectEqual(t, fc, expectedSTS(t, fc, opts), removeHashAnnotation)
// 2. Apply update to the Secret that imitates the proxy setting device_id.
s := expectedSecret(t, fc, opts)
mustUpdate(t, fc, s.Namespace, s.Name, func(s *corev1.Secret) {
mak.Set(&s.Data, "device_id", []byte("dkkdi4CNTRL"))
})
// 3. Config should no longer contain auth key
expectReconciled(t, sr, "default", "test")
opts.shouldRemoveAuthKey = true
opts.secretExtraData = map[string][]byte{"device_id": []byte("dkkdi4CNTRL")}
expectEqual(t, fc, expectedSecret(t, fc, opts), nil)
}
func Test_externalNameService(t *testing.T) {
fc := fake.NewFakeClient()
@@ -1529,6 +1750,7 @@ func Test_externalNameService(t *testing.T) {
parentType: "svc",
hostname: "default-test",
clusterTargetDNS: "foo.com",
app: kubetypes.AppIngressProxy,
}
expectEqual(t, fc, expectedSecret(t, fc, opts), nil)

View File

@@ -11,16 +11,19 @@ import (
"log"
"net/http"
"net/http/httputil"
"net/netip"
"net/url"
"os"
"strings"
"github.com/pkg/errors"
"go.uber.org/zap"
"k8s.io/client-go/rest"
"k8s.io/client-go/transport"
"tailscale.com/client/tailscale"
"tailscale.com/client/tailscale/apitype"
tskube "tailscale.com/kube"
ksr "tailscale.com/k8s-operator/sessionrecording"
"tailscale.com/kube/kubetypes"
"tailscale.com/tailcfg"
"tailscale.com/tsnet"
"tailscale.com/util/clientmetric"
@@ -28,12 +31,27 @@ import (
"tailscale.com/util/set"
)
var whoIsKey = ctxkey.New("", (*apitype.WhoIsResponse)(nil))
var counterNumRequestsProxied = clientmetric.NewCounter("k8s_auth_proxy_requests_proxied")
var (
// counterNumRequestsproxies counts the number of API server requests proxied via this proxy.
counterNumRequestsProxied = clientmetric.NewCounter("k8s_auth_proxy_requests_proxied")
whoIsKey = ctxkey.New("", (*apitype.WhoIsResponse)(nil))
)
type apiServerProxyMode int
func (a apiServerProxyMode) String() string {
switch a {
case apiserverProxyModeDisabled:
return "disabled"
case apiserverProxyModeEnabled:
return "auth"
case apiserverProxyModeNoAuth:
return "noauth"
default:
return "unknown"
}
}
const (
apiserverProxyModeDisabled apiServerProxyMode = iota
apiserverProxyModeEnabled
@@ -97,26 +115,7 @@ func maybeLaunchAPIServerProxy(zlog *zap.SugaredLogger, restConfig *rest.Config,
if err != nil {
startlog.Fatalf("could not get rest.TransportConfig(): %v", err)
}
go runAPIServerProxy(s, rt, zlog.Named("apiserver-proxy"), mode)
}
// apiserverProxy is an http.Handler that authenticates requests using the Tailscale
// LocalAPI and then proxies them to the Kubernetes API.
type apiserverProxy struct {
log *zap.SugaredLogger
lc *tailscale.LocalClient
rp *httputil.ReverseProxy
}
func (h *apiserverProxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
who, err := h.lc.WhoIs(r.Context(), r.RemoteAddr)
if err != nil {
h.log.Errorf("failed to authenticate caller: %v", err)
http.Error(w, "failed to authenticate caller", http.StatusInternalServerError)
return
}
counterNumRequestsProxied.Add(1)
h.rp.ServeHTTP(w, r.WithContext(whoIsKey.WithValue(r.Context(), who)))
go runAPIServerProxy(s, rt, zlog.Named("apiserver-proxy"), mode, restConfig.Host)
}
// runAPIServerProxy runs an HTTP server that authenticates requests using the
@@ -133,64 +132,43 @@ func (h *apiserverProxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
// are passed through to the Kubernetes API.
//
// It never returns.
func runAPIServerProxy(s *tsnet.Server, rt http.RoundTripper, log *zap.SugaredLogger, mode apiServerProxyMode) {
func runAPIServerProxy(ts *tsnet.Server, rt http.RoundTripper, log *zap.SugaredLogger, mode apiServerProxyMode, host string) {
if mode == apiserverProxyModeDisabled {
return
}
ln, err := s.Listen("tcp", ":443")
ln, err := ts.Listen("tcp", ":443")
if err != nil {
log.Fatalf("could not listen on :443: %v", err)
}
u, err := url.Parse(fmt.Sprintf("https://%s:%s", os.Getenv("KUBERNETES_SERVICE_HOST"), os.Getenv("KUBERNETES_SERVICE_PORT_HTTPS")))
u, err := url.Parse(host)
if err != nil {
log.Fatalf("runAPIServerProxy: failed to parse URL %v", err)
}
lc, err := s.LocalClient()
lc, err := ts.LocalClient()
if err != nil {
log.Fatalf("could not get local client: %v", err)
}
ap := &apiserverProxy{
log: log,
lc: lc,
rp: &httputil.ReverseProxy{
Rewrite: func(r *httputil.ProxyRequest) {
// Replace the URL with the Kubernetes APIServer.
r.Out.URL.Scheme = u.Scheme
r.Out.URL.Host = u.Host
if mode == apiserverProxyModeNoAuth {
// If we are not providing authentication, then we are just
// proxying to the Kubernetes API, so we don't need to do
// anything else.
return
}
// We want to proxy to the Kubernetes API, but we want to use
// the caller's identity to do so. We do this by impersonating
// the caller using the Kubernetes User Impersonation feature:
// https://kubernetes.io/docs/reference/access-authn-authz/authentication/#user-impersonation
// Out of paranoia, remove all authentication headers that might
// have been set by the client.
r.Out.Header.Del("Authorization")
r.Out.Header.Del("Impersonate-Group")
r.Out.Header.Del("Impersonate-User")
r.Out.Header.Del("Impersonate-Uid")
for k := range r.Out.Header {
if strings.HasPrefix(k, "Impersonate-Extra-") {
r.Out.Header.Del(k)
}
}
// Now add the impersonation headers that we want.
if err := addImpersonationHeaders(r.Out, log); err != nil {
panic("failed to add impersonation headers: " + err.Error())
}
},
Transport: rt,
},
log: log,
lc: lc,
mode: mode,
upstreamURL: u,
ts: ts,
}
ap.rp = &httputil.ReverseProxy{
Rewrite: func(pr *httputil.ProxyRequest) {
ap.addImpersonationHeadersAsRequired(pr.Out)
},
Transport: rt,
}
mux := http.NewServeMux()
mux.HandleFunc("/", ap.serveDefault)
mux.HandleFunc("POST /api/v1/namespaces/{namespace}/pods/{pod}/exec", ap.serveExecSPDY)
mux.HandleFunc("GET /api/v1/namespaces/{namespace}/pods/{pod}/exec", ap.serveExecWS)
hs := &http.Server{
// Kubernetes uses SPDY for exec and port-forward, however SPDY is
// incompatible with HTTP/2; so disable HTTP/2 in the proxy.
@@ -199,14 +177,153 @@ func runAPIServerProxy(s *tsnet.Server, rt http.RoundTripper, log *zap.SugaredLo
NextProtos: []string{"http/1.1"},
},
TLSNextProto: make(map[string]func(*http.Server, *tls.Conn, http.Handler)),
Handler: ap,
Handler: mux,
}
log.Infof("listening on %s", ln.Addr())
log.Infof("API server proxy in %q mode is listening on %s", mode, ln.Addr())
if err := hs.ServeTLS(ln, "", ""); err != nil {
log.Fatalf("runAPIServerProxy: failed to serve %v", err)
}
}
// apiserverProxy is an [net/http.Handler] that authenticates requests using the Tailscale
// LocalAPI and then proxies them to the Kubernetes API.
type apiserverProxy struct {
log *zap.SugaredLogger
lc *tailscale.LocalClient
rp *httputil.ReverseProxy
mode apiServerProxyMode
ts *tsnet.Server
upstreamURL *url.URL
}
// serveDefault is the default handler for Kubernetes API server requests.
func (ap *apiserverProxy) serveDefault(w http.ResponseWriter, r *http.Request) {
who, err := ap.whoIs(r)
if err != nil {
ap.authError(w, err)
return
}
counterNumRequestsProxied.Add(1)
ap.rp.ServeHTTP(w, r.WithContext(whoIsKey.WithValue(r.Context(), who)))
}
// serveExecSPDY serves 'kubectl exec' requests for sessions streamed over SPDY,
// optionally configuring the kubectl exec sessions to be recorded.
func (ap *apiserverProxy) serveExecSPDY(w http.ResponseWriter, r *http.Request) {
ap.execForProto(w, r, ksr.SPDYProtocol)
}
// serveExecWS serves 'kubectl exec' requests for sessions streamed over WebSocket,
// optionally configuring the kubectl exec sessions to be recorded.
func (ap *apiserverProxy) serveExecWS(w http.ResponseWriter, r *http.Request) {
ap.execForProto(w, r, ksr.WSProtocol)
}
func (ap *apiserverProxy) execForProto(w http.ResponseWriter, r *http.Request, proto ksr.Protocol) {
const (
podNameKey = "pod"
namespaceNameKey = "namespace"
upgradeHeaderKey = "Upgrade"
)
who, err := ap.whoIs(r)
if err != nil {
ap.authError(w, err)
return
}
counterNumRequestsProxied.Add(1)
failOpen, addrs, err := determineRecorderConfig(who)
if err != nil {
ap.log.Errorf("error trying to determine whether the 'kubectl exec' session needs to be recorded: %v", err)
return
}
if failOpen && len(addrs) == 0 { // will not record
ap.rp.ServeHTTP(w, r.WithContext(whoIsKey.WithValue(r.Context(), who)))
return
}
ksr.CounterSessionRecordingsAttempted.Add(1) // at this point we know that users intended for this session to be recorded
if !failOpen && len(addrs) == 0 {
msg := "forbidden: 'kubectl exec' session must be recorded, but no recorders are available."
ap.log.Error(msg)
http.Error(w, msg, http.StatusForbidden)
return
}
wantsHeader := upgradeHeaderForProto[proto]
if h := r.Header.Get(upgradeHeaderKey); h != wantsHeader {
msg := fmt.Sprintf("[unexpected] unable to verify that streaming protocol is %s, wants Upgrade header %q, got: %q", proto, wantsHeader, h)
if failOpen {
msg = msg + "; failure mode is 'fail open'; continuing session without recording."
ap.log.Warn(msg)
ap.rp.ServeHTTP(w, r.WithContext(whoIsKey.WithValue(r.Context(), who)))
return
}
ap.log.Error(msg)
msg += "; failure mode is 'fail closed'; closing connection."
http.Error(w, msg, http.StatusForbidden)
return
}
opts := ksr.HijackerOpts{
Req: r,
W: w,
Proto: proto,
TS: ap.ts,
Who: who,
Addrs: addrs,
FailOpen: failOpen,
Pod: r.PathValue(podNameKey),
Namespace: r.PathValue(namespaceNameKey),
Log: ap.log,
}
h := ksr.New(opts)
ap.rp.ServeHTTP(h, r.WithContext(whoIsKey.WithValue(r.Context(), who)))
}
func (h *apiserverProxy) addImpersonationHeadersAsRequired(r *http.Request) {
r.URL.Scheme = h.upstreamURL.Scheme
r.URL.Host = h.upstreamURL.Host
if h.mode == apiserverProxyModeNoAuth {
// If we are not providing authentication, then we are just
// proxying to the Kubernetes API, so we don't need to do
// anything else.
return
}
// We want to proxy to the Kubernetes API, but we want to use
// the caller's identity to do so. We do this by impersonating
// the caller using the Kubernetes User Impersonation feature:
// https://kubernetes.io/docs/reference/access-authn-authz/authentication/#user-impersonation
// Out of paranoia, remove all authentication headers that might
// have been set by the client.
r.Header.Del("Authorization")
r.Header.Del("Impersonate-Group")
r.Header.Del("Impersonate-User")
r.Header.Del("Impersonate-Uid")
for k := range r.Header {
if strings.HasPrefix(k, "Impersonate-Extra-") {
r.Header.Del(k)
}
}
// Now add the impersonation headers that we want.
if err := addImpersonationHeaders(r, h.log); err != nil {
log.Printf("failed to add impersonation headers: " + err.Error())
}
}
func (ap *apiserverProxy) whoIs(r *http.Request) (*apitype.WhoIsResponse, error) {
return ap.lc.WhoIs(r.Context(), r.RemoteAddr)
}
func (ap *apiserverProxy) authError(w http.ResponseWriter, err error) {
ap.log.Errorf("failed to authenticate caller: %v", err)
http.Error(w, "failed to authenticate caller", http.StatusInternalServerError)
}
const (
// oldCapabilityName is a legacy form of
// tailfcg.PeerCapabilityKubernetes capability. The only capability rule
@@ -222,10 +339,10 @@ const (
func addImpersonationHeaders(r *http.Request, log *zap.SugaredLogger) error {
log = log.With("remote", r.RemoteAddr)
who := whoIsKey.Value(r.Context())
rules, err := tailcfg.UnmarshalCapJSON[tskube.KubernetesCapRule](who.CapMap, tailcfg.PeerCapabilityKubernetes)
rules, err := tailcfg.UnmarshalCapJSON[kubetypes.KubernetesCapRule](who.CapMap, tailcfg.PeerCapabilityKubernetes)
if len(rules) == 0 && err == nil {
// Try the old capability name for backwards compatibility.
rules, err = tailcfg.UnmarshalCapJSON[tskube.KubernetesCapRule](who.CapMap, oldCapabilityName)
rules, err = tailcfg.UnmarshalCapJSON[kubetypes.KubernetesCapRule](who.CapMap, oldCapabilityName)
}
if err != nil {
return fmt.Errorf("failed to unmarshal capability: %v", err)
@@ -266,3 +383,39 @@ func addImpersonationHeaders(r *http.Request, log *zap.SugaredLogger) error {
}
return nil
}
// determineRecorderConfig determines recorder config from requester's peer
// capabilities. Determines whether a 'kubectl exec' session from this requester
// needs to be recorded and what recorders the recording should be sent to.
func determineRecorderConfig(who *apitype.WhoIsResponse) (failOpen bool, recorderAddresses []netip.AddrPort, _ error) {
if who == nil {
return false, nil, errors.New("[unexpected] cannot determine caller")
}
failOpen = true
rules, err := tailcfg.UnmarshalCapJSON[kubetypes.KubernetesCapRule](who.CapMap, tailcfg.PeerCapabilityKubernetes)
if err != nil {
return failOpen, nil, fmt.Errorf("failed to unmarshal Kubernetes capability: %w", err)
}
if len(rules) == 0 {
return failOpen, nil, nil
}
for _, rule := range rules {
if len(rule.RecorderAddrs) != 0 {
// TODO (irbekrm): here or later determine if the
// recorders behind those addrs are online - else we
// spend 30s trying to reach a recorder whose tailscale
// status is offline.
recorderAddresses = append(recorderAddresses, rule.RecorderAddrs...)
}
if rule.EnforceRecorder {
failOpen = false
}
}
return failOpen, recorderAddresses, nil
}
var upgradeHeaderForProto = map[ksr.Protocol]string{
ksr.SPDYProtocol: "SPDY/3.1",
ksr.WSProtocol: "websocket",
}

View File

@@ -7,6 +7,8 @@ package main
import (
"net/http"
"net/netip"
"reflect"
"testing"
"github.com/google/go-cmp/cmp"
@@ -126,3 +128,72 @@ func TestImpersonationHeaders(t *testing.T) {
}
}
}
func Test_determineRecorderConfig(t *testing.T) {
addr1, addr2 := netip.MustParseAddrPort("[fd7a:115c:a1e0:ab12:4843:cd96:626b:628b]:80"), netip.MustParseAddrPort("100.99.99.99:80")
tests := []struct {
name string
wantFailOpen bool
wantRecorderAddresses []netip.AddrPort
who *apitype.WhoIsResponse
}{
{
name: "two_ips_fail_closed",
who: whoResp(map[string][]string{string(tailcfg.PeerCapabilityKubernetes): {`{"recorderAddrs":["[fd7a:115c:a1e0:ab12:4843:cd96:626b:628b]:80","100.99.99.99:80"],"enforceRecorder":true}`}}),
wantRecorderAddresses: []netip.AddrPort{addr1, addr2},
},
{
name: "two_ips_fail_open",
who: whoResp(map[string][]string{string(tailcfg.PeerCapabilityKubernetes): {`{"recorderAddrs":["[fd7a:115c:a1e0:ab12:4843:cd96:626b:628b]:80","100.99.99.99:80"]}`}}),
wantRecorderAddresses: []netip.AddrPort{addr1, addr2},
wantFailOpen: true,
},
{
name: "odd_rule_combination_fail_closed",
who: whoResp(map[string][]string{string(tailcfg.PeerCapabilityKubernetes): {`{"recorderAddrs":["100.99.99.99:80"],"enforceRecorder":false}`, `{"recorderAddrs":["[fd7a:115c:a1e0:ab12:4843:cd96:626b:628b]:80"]}`, `{"enforceRecorder":true,"impersonate":{"groups":["system:masters"]}}`}}),
wantRecorderAddresses: []netip.AddrPort{addr2, addr1},
},
{
name: "no_caps",
who: whoResp(map[string][]string{}),
wantFailOpen: true,
},
{
name: "no_recorder_caps",
who: whoResp(map[string][]string{"foo": {`{"x":"y"}`}, string(tailcfg.PeerCapabilityKubernetes): {`{"impersonate":{"groups":["system:masters"]}}`}}),
wantFailOpen: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
gotFailOpen, gotRecorderAddresses, err := determineRecorderConfig(tt.who)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if gotFailOpen != tt.wantFailOpen {
t.Errorf("determineRecorderConfig() gotFailOpen = %v, want %v", gotFailOpen, tt.wantFailOpen)
}
if !reflect.DeepEqual(gotRecorderAddresses, tt.wantRecorderAddresses) {
t.Errorf("determineRecorderConfig() gotRecorderAddresses = %v, want %v", gotRecorderAddresses, tt.wantRecorderAddresses)
}
})
}
}
func whoResp(capMap map[string][]string) *apitype.WhoIsResponse {
resp := &apitype.WhoIsResponse{
CapMap: tailcfg.PeerCapMap{},
}
for cap, rules := range capMap {
resp.CapMap[tailcfg.PeerCapability(cap)] = raw(rules...)
}
return resp
}
func raw(in ...string) []tailcfg.RawMessage {
var out []tailcfg.RawMessage
for _, i := range in {
out = append(out, tailcfg.RawMessage(i))
}
return out
}

View File

@@ -8,7 +8,9 @@ package main
import (
"context"
"fmt"
"slices"
"strings"
"sync"
dockerref "github.com/distribution/reference"
"go.uber.org/zap"
@@ -18,6 +20,7 @@ import (
apivalidation "k8s.io/apimachinery/pkg/api/validation"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
metavalidation "k8s.io/apimachinery/pkg/apis/meta/v1/validation"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/validation/field"
"k8s.io/client-go/tools/record"
"sigs.k8s.io/controller-runtime/pkg/client"
@@ -25,6 +28,8 @@ import (
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/tstime"
"tailscale.com/util/clientmetric"
"tailscale.com/util/set"
)
const (
@@ -41,8 +46,20 @@ type ProxyClassReconciler struct {
recorder record.EventRecorder
logger *zap.SugaredLogger
clock tstime.Clock
mu sync.Mutex // protects following
// managedProxyClasses is a set of all ProxyClass resources that we're currently
// managing. This is only used for metrics.
managedProxyClasses set.Slice[types.UID]
}
var (
// gaugeProxyClassResources tracks the number of ProxyClass resources
// that we're currently managing.
gaugeProxyClassResources = clientmetric.NewGauge("k8s_proxyclass_resources")
)
func (pcr *ProxyClassReconciler) Reconcile(ctx context.Context, req reconcile.Request) (res reconcile.Result, err error) {
logger := pcr.logger.With("ProxyClass", req.Name)
logger.Debugf("starting reconcile")
@@ -57,16 +74,33 @@ func (pcr *ProxyClassReconciler) Reconcile(ctx context.Context, req reconcile.Re
return reconcile.Result{}, fmt.Errorf("failed to get tailscale.com ProxyClass: %w", err)
}
if !pc.DeletionTimestamp.IsZero() {
logger.Debugf("ProxyClass is being deleted, do nothing")
return reconcile.Result{}, nil
logger.Debugf("ProxyClass is being deleted")
return reconcile.Result{}, pcr.maybeCleanup(ctx, logger, pc)
}
// Add a finalizer so that we can ensure that metrics get updated when
// this ProxyClass is deleted.
if !slices.Contains(pc.Finalizers, FinalizerName) {
logger.Debugf("updating ProxyClass finalizers")
pc.Finalizers = append(pc.Finalizers, FinalizerName)
if err := pcr.Update(ctx, pc); err != nil {
return res, fmt.Errorf("failed to add finalizer: %w", err)
}
}
// Ensure this ProxyClass is tracked in metrics.
pcr.mu.Lock()
pcr.managedProxyClasses.Add(pc.UID)
gaugeProxyClassResources.Set(int64(pcr.managedProxyClasses.Len()))
pcr.mu.Unlock()
oldPCStatus := pc.Status.DeepCopy()
if errs := pcr.validate(pc); errs != nil {
msg := fmt.Sprintf(messageProxyClassInvalid, errs.ToAggregate().Error())
pcr.recorder.Event(pc, corev1.EventTypeWarning, reasonProxyClassInvalid, msg)
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassready, metav1.ConditionFalse, reasonProxyClassInvalid, msg, pc.Generation, pcr.clock, logger)
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassReady, metav1.ConditionFalse, reasonProxyClassInvalid, msg, pc.Generation, pcr.clock, logger)
} else {
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassready, metav1.ConditionTrue, reasonProxyClassValid, reasonProxyClassValid, pc.Generation, pcr.clock, logger)
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassReady, metav1.ConditionTrue, reasonProxyClassValid, reasonProxyClassValid, pc.Generation, pcr.clock, logger)
}
if !apiequality.Semantic.DeepEqual(oldPCStatus, pc.Status) {
if err := pcr.Client.Status().Update(ctx, pc); err != nil {
@@ -77,7 +111,7 @@ func (pcr *ProxyClassReconciler) Reconcile(ctx context.Context, req reconcile.Re
return reconcile.Result{}, nil
}
func (a *ProxyClassReconciler) validate(pc *tsapi.ProxyClass) (violations field.ErrorList) {
func (pcr *ProxyClassReconciler) validate(pc *tsapi.ProxyClass) (violations field.ErrorList) {
if sts := pc.Spec.StatefulSet; sts != nil {
if len(sts.Labels) > 0 {
if errs := metavalidation.ValidateLabels(sts.Labels, field.NewPath(".spec.statefulSet.labels")); errs != nil {
@@ -103,13 +137,13 @@ func (a *ProxyClassReconciler) validate(pc *tsapi.ProxyClass) (violations field.
if tc := pod.TailscaleContainer; tc != nil {
for _, e := range tc.Env {
if strings.HasPrefix(string(e.Name), "TS_") {
a.recorder.Event(pc, corev1.EventTypeWarning, reasonCustomTSEnvVar, fmt.Sprintf(messageCustomTSEnvVar, string(e.Name), "tailscale"))
pcr.recorder.Event(pc, corev1.EventTypeWarning, reasonCustomTSEnvVar, fmt.Sprintf(messageCustomTSEnvVar, string(e.Name), "tailscale"))
}
if strings.EqualFold(string(e.Name), "EXPERIMENTAL_TS_CONFIGFILE_PATH") {
a.recorder.Event(pc, corev1.EventTypeWarning, reasonCustomTSEnvVar, fmt.Sprintf(messageCustomTSEnvVar, string(e.Name), "tailscale"))
pcr.recorder.Event(pc, corev1.EventTypeWarning, reasonCustomTSEnvVar, fmt.Sprintf(messageCustomTSEnvVar, string(e.Name), "tailscale"))
}
if strings.EqualFold(string(e.Name), "EXPERIMENTAL_ALLOW_PROXYING_CLUSTER_TRAFFIC_VIA_INGRESS") {
a.recorder.Event(pc, corev1.EventTypeWarning, reasonCustomTSEnvVar, fmt.Sprintf(messageCustomTSEnvVar, string(e.Name), "tailscale"))
pcr.recorder.Event(pc, corev1.EventTypeWarning, reasonCustomTSEnvVar, fmt.Sprintf(messageCustomTSEnvVar, string(e.Name), "tailscale"))
}
}
if tc.Image != "" {
@@ -135,3 +169,27 @@ func (a *ProxyClassReconciler) validate(pc *tsapi.ProxyClass) (violations field.
// time.
return violations
}
// maybeCleanup removes tailscale.com finalizer and ensures that the ProxyClass
// is no longer counted towards k8s_proxyclass_resources.
func (pcr *ProxyClassReconciler) maybeCleanup(ctx context.Context, logger *zap.SugaredLogger, pc *tsapi.ProxyClass) error {
ix := slices.Index(pc.Finalizers, FinalizerName)
if ix < 0 {
logger.Debugf("no finalizer, nothing to do")
pcr.mu.Lock()
defer pcr.mu.Unlock()
pcr.managedProxyClasses.Remove(pc.UID)
gaugeProxyClassResources.Set(int64(pcr.managedProxyClasses.Len()))
return nil
}
pc.Finalizers = append(pc.Finalizers[:ix], pc.Finalizers[ix+1:]...)
if err := pcr.Update(ctx, pc); err != nil {
return fmt.Errorf("failed to remove finalizer: %w", err)
}
pcr.mu.Lock()
defer pcr.mu.Unlock()
pcr.managedProxyClasses.Remove(pc.UID)
gaugeProxyClassResources.Set(int64(pcr.managedProxyClasses.Len()))
logger.Infof("ProxyClass resources have been cleaned up")
return nil
}

View File

@@ -29,7 +29,8 @@ func TestProxyClass(t *testing.T) {
// The apiserver is supposed to set the UID, but the fake client
// doesn't. So, set it explicitly because other code later depends
// on it being set.
UID: types.UID("1234-UID"),
UID: types.UID("1234-UID"),
Finalizers: []string{"tailscale.com/finalizer"},
},
Spec: tsapi.ProxyClassSpec{
StatefulSet: &tsapi.StatefulSet{
@@ -68,7 +69,7 @@ func TestProxyClass(t *testing.T) {
// 1. A valid ProxyClass resource gets its status updated to Ready.
expectReconciled(t, pcr, "", "test")
pc.Status.Conditions = append(pc.Status.Conditions, metav1.Condition{
Type: string(tsapi.ProxyClassready),
Type: string(tsapi.ProxyClassReady),
Status: metav1.ConditionTrue,
Reason: reasonProxyClassValid,
Message: reasonProxyClassValid,
@@ -84,7 +85,7 @@ func TestProxyClass(t *testing.T) {
})
expectReconciled(t, pcr, "", "test")
msg := `ProxyClass is not valid: .spec.statefulSet.labels: Invalid value: "?!someVal": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')`
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassready, metav1.ConditionFalse, reasonProxyClassInvalid, msg, 0, cl, zl.Sugar())
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassReady, metav1.ConditionFalse, reasonProxyClassInvalid, msg, 0, cl, zl.Sugar())
expectEqual(t, fc, pc, nil)
expectedEvent := "Warning ProxyClassInvalid ProxyClass is not valid: .spec.statefulSet.labels: Invalid value: \"?!someVal\": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')"
expectEvents(t, fr, []string{expectedEvent})
@@ -98,7 +99,7 @@ func TestProxyClass(t *testing.T) {
})
expectReconciled(t, pcr, "", "test")
msg = `ProxyClass is not valid: spec.statefulSet.pod.tailscaleContainer.image: Invalid value: "FOO bar": invalid reference format: repository name (library/FOO bar) must be lowercase`
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassready, metav1.ConditionFalse, reasonProxyClassInvalid, msg, 0, cl, zl.Sugar())
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassReady, metav1.ConditionFalse, reasonProxyClassInvalid, msg, 0, cl, zl.Sugar())
expectEqual(t, fc, pc, nil)
expectedEvent = `Warning ProxyClassInvalid ProxyClass is not valid: spec.statefulSet.pod.tailscaleContainer.image: Invalid value: "FOO bar": invalid reference format: repository name (library/FOO bar) must be lowercase`
expectEvents(t, fr, []string{expectedEvent})
@@ -117,7 +118,7 @@ func TestProxyClass(t *testing.T) {
})
expectReconciled(t, pcr, "", "test")
msg = `ProxyClass is not valid: spec.statefulSet.pod.tailscaleInitContainer.image: Invalid value: "FOO bar": invalid reference format: repository name (library/FOO bar) must be lowercase`
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassready, metav1.ConditionFalse, reasonProxyClassInvalid, msg, 0, cl, zl.Sugar())
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassReady, metav1.ConditionFalse, reasonProxyClassInvalid, msg, 0, cl, zl.Sugar())
expectEqual(t, fc, pc, nil)
expectedEvent = `Warning ProxyClassInvalid ProxyClass is not valid: spec.statefulSet.pod.tailscaleInitContainer.image: Invalid value: "FOO bar": invalid reference format: repository name (library/FOO bar) must be lowercase`
expectEvents(t, fr, []string{expectedEvent})

View File

@@ -0,0 +1,549 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"context"
"crypto/sha256"
"encoding/json"
"fmt"
"net/http"
"slices"
"sync"
"github.com/pkg/errors"
"go.uber.org/zap"
xslices "golang.org/x/exp/slices"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
rbacv1 "k8s.io/api/rbac/v1"
apiequality "k8s.io/apimachinery/pkg/api/equality"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"k8s.io/client-go/tools/record"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
"tailscale.com/client/tailscale"
"tailscale.com/ipn"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/kubetypes"
"tailscale.com/tailcfg"
"tailscale.com/tstime"
"tailscale.com/types/ptr"
"tailscale.com/util/clientmetric"
"tailscale.com/util/mak"
"tailscale.com/util/set"
)
const (
reasonProxyGroupCreationFailed = "ProxyGroupCreationFailed"
reasonProxyGroupReady = "ProxyGroupReady"
reasonProxyGroupCreating = "ProxyGroupCreating"
reasonProxyGroupInvalid = "ProxyGroupInvalid"
)
var gaugeProxyGroupResources = clientmetric.NewGauge(kubetypes.MetricProxyGroupEgressCount)
// ProxyGroupReconciler ensures cluster resources for a ProxyGroup definition.
type ProxyGroupReconciler struct {
client.Client
l *zap.SugaredLogger
recorder record.EventRecorder
clock tstime.Clock
tsClient tsClient
// User-specified defaults from the helm installation.
tsNamespace string
proxyImage string
defaultTags []string
tsFirewallMode string
defaultProxyClass string
mu sync.Mutex // protects following
proxyGroups set.Slice[types.UID] // for proxygroups gauge
}
func (r *ProxyGroupReconciler) logger(name string) *zap.SugaredLogger {
return r.l.With("ProxyGroup", name)
}
func (r *ProxyGroupReconciler) Reconcile(ctx context.Context, req reconcile.Request) (_ reconcile.Result, err error) {
logger := r.logger(req.Name)
logger.Debugf("starting reconcile")
defer logger.Debugf("reconcile finished")
pg := new(tsapi.ProxyGroup)
err = r.Get(ctx, req.NamespacedName, pg)
if apierrors.IsNotFound(err) {
logger.Debugf("ProxyGroup not found, assuming it was deleted")
return reconcile.Result{}, nil
} else if err != nil {
return reconcile.Result{}, fmt.Errorf("failed to get tailscale.com ProxyGroup: %w", err)
}
if markedForDeletion(pg) {
logger.Debugf("ProxyGroup is being deleted, cleaning up resources")
ix := xslices.Index(pg.Finalizers, FinalizerName)
if ix < 0 {
logger.Debugf("no finalizer, nothing to do")
return reconcile.Result{}, nil
}
if done, err := r.maybeCleanup(ctx, pg); err != nil {
return reconcile.Result{}, err
} else if !done {
logger.Debugf("ProxyGroup resource cleanup not yet finished, will retry...")
return reconcile.Result{RequeueAfter: shortRequeue}, nil
}
pg.Finalizers = slices.Delete(pg.Finalizers, ix, ix+1)
if err := r.Update(ctx, pg); err != nil {
return reconcile.Result{}, err
}
return reconcile.Result{}, nil
}
oldPGStatus := pg.Status.DeepCopy()
setStatusReady := func(pg *tsapi.ProxyGroup, status metav1.ConditionStatus, reason, message string) (reconcile.Result, error) {
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, status, reason, message, pg.Generation, r.clock, logger)
if !apiequality.Semantic.DeepEqual(oldPGStatus, pg.Status) {
// An error encountered here should get returned by the Reconcile function.
if updateErr := r.Client.Status().Update(ctx, pg); updateErr != nil {
err = errors.Wrap(err, updateErr.Error())
}
}
return reconcile.Result{}, err
}
if !slices.Contains(pg.Finalizers, FinalizerName) {
// This log line is printed exactly once during initial provisioning,
// because once the finalizer is in place this block gets skipped. So,
// this is a nice place to log that the high level, multi-reconcile
// operation is underway.
logger.Infof("ensuring ProxyGroup is set up")
pg.Finalizers = append(pg.Finalizers, FinalizerName)
if err = r.Update(ctx, pg); err != nil {
err = fmt.Errorf("error adding finalizer: %w", err)
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreationFailed, reasonProxyGroupCreationFailed)
}
}
if err = r.validate(pg); err != nil {
message := fmt.Sprintf("ProxyGroup is invalid: %s", err)
r.recorder.Eventf(pg, corev1.EventTypeWarning, reasonProxyGroupInvalid, message)
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupInvalid, message)
}
proxyClassName := r.defaultProxyClass
if pg.Spec.ProxyClass != "" {
proxyClassName = pg.Spec.ProxyClass
}
var proxyClass *tsapi.ProxyClass
if proxyClassName != "" {
proxyClass = new(tsapi.ProxyClass)
err := r.Get(ctx, types.NamespacedName{Name: proxyClassName}, proxyClass)
if apierrors.IsNotFound(err) {
err = nil
message := fmt.Sprintf("the ProxyGroup's ProxyClass %s does not (yet) exist", proxyClassName)
logger.Info(message)
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreating, message)
}
if err != nil {
err = fmt.Errorf("error getting ProxyGroup's ProxyClass %s: %s", proxyClassName, err)
r.recorder.Eventf(pg, corev1.EventTypeWarning, reasonProxyGroupCreationFailed, err.Error())
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreationFailed, err.Error())
}
if !tsoperator.ProxyClassIsReady(proxyClass) {
message := fmt.Sprintf("the ProxyGroup's ProxyClass %s is not yet in a ready state, waiting...", proxyClassName)
logger.Info(message)
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreating, message)
}
}
if err = r.maybeProvision(ctx, pg, proxyClass); err != nil {
err = fmt.Errorf("error provisioning ProxyGroup resources: %w", err)
r.recorder.Eventf(pg, corev1.EventTypeWarning, reasonProxyGroupCreationFailed, err.Error())
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreationFailed, err.Error())
}
desiredReplicas := int(pgReplicas(pg))
if len(pg.Status.Devices) < desiredReplicas {
message := fmt.Sprintf("%d/%d ProxyGroup pods running", len(pg.Status.Devices), desiredReplicas)
logger.Debug(message)
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreating, message)
}
if len(pg.Status.Devices) > desiredReplicas {
message := fmt.Sprintf("waiting for %d ProxyGroup pods to shut down", len(pg.Status.Devices)-desiredReplicas)
logger.Debug(message)
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreating, message)
}
logger.Info("ProxyGroup resources synced")
return setStatusReady(pg, metav1.ConditionTrue, reasonProxyGroupReady, reasonProxyGroupReady)
}
func (r *ProxyGroupReconciler) maybeProvision(ctx context.Context, pg *tsapi.ProxyGroup, proxyClass *tsapi.ProxyClass) error {
logger := r.logger(pg.Name)
r.mu.Lock()
r.proxyGroups.Add(pg.UID)
gaugeProxyGroupResources.Set(int64(r.proxyGroups.Len()))
r.mu.Unlock()
cfgHash, err := r.ensureConfigSecretsCreated(ctx, pg, proxyClass)
if err != nil {
return fmt.Errorf("error provisioning config Secrets: %w", err)
}
// State secrets are precreated so we can use the ProxyGroup CR as their owner ref.
stateSecrets := pgStateSecrets(pg, r.tsNamespace)
for _, sec := range stateSecrets {
if _, err := createOrUpdate(ctx, r.Client, r.tsNamespace, sec, func(s *corev1.Secret) {
s.ObjectMeta.Labels = sec.ObjectMeta.Labels
s.ObjectMeta.Annotations = sec.ObjectMeta.Annotations
s.ObjectMeta.OwnerReferences = sec.ObjectMeta.OwnerReferences
}); err != nil {
return fmt.Errorf("error provisioning state Secrets: %w", err)
}
}
sa := pgServiceAccount(pg, r.tsNamespace)
if _, err := createOrUpdate(ctx, r.Client, r.tsNamespace, sa, func(s *corev1.ServiceAccount) {
s.ObjectMeta.Labels = sa.ObjectMeta.Labels
s.ObjectMeta.Annotations = sa.ObjectMeta.Annotations
s.ObjectMeta.OwnerReferences = sa.ObjectMeta.OwnerReferences
}); err != nil {
return fmt.Errorf("error provisioning ServiceAccount: %w", err)
}
role := pgRole(pg, r.tsNamespace)
if _, err := createOrUpdate(ctx, r.Client, r.tsNamespace, role, func(r *rbacv1.Role) {
r.ObjectMeta.Labels = role.ObjectMeta.Labels
r.ObjectMeta.Annotations = role.ObjectMeta.Annotations
r.ObjectMeta.OwnerReferences = role.ObjectMeta.OwnerReferences
r.Rules = role.Rules
}); err != nil {
return fmt.Errorf("error provisioning Role: %w", err)
}
roleBinding := pgRoleBinding(pg, r.tsNamespace)
if _, err := createOrUpdate(ctx, r.Client, r.tsNamespace, roleBinding, func(r *rbacv1.RoleBinding) {
r.ObjectMeta.Labels = roleBinding.ObjectMeta.Labels
r.ObjectMeta.Annotations = roleBinding.ObjectMeta.Annotations
r.ObjectMeta.OwnerReferences = roleBinding.ObjectMeta.OwnerReferences
r.RoleRef = roleBinding.RoleRef
r.Subjects = roleBinding.Subjects
}); err != nil {
return fmt.Errorf("error provisioning RoleBinding: %w", err)
}
if pg.Spec.Type == tsapi.ProxyGroupTypeEgress {
cm := pgEgressCM(pg, r.tsNamespace)
if _, err := createOrUpdate(ctx, r.Client, r.tsNamespace, cm, func(existing *corev1.ConfigMap) {
existing.ObjectMeta.Labels = cm.ObjectMeta.Labels
existing.ObjectMeta.OwnerReferences = cm.ObjectMeta.OwnerReferences
}); err != nil {
return fmt.Errorf("error provisioning ConfigMap: %w", err)
}
}
ss, err := pgStatefulSet(pg, r.tsNamespace, r.proxyImage, r.tsFirewallMode, cfgHash)
if err != nil {
return fmt.Errorf("error generating StatefulSet spec: %w", err)
}
ss = applyProxyClassToStatefulSet(proxyClass, ss, nil, logger)
if _, err := createOrUpdate(ctx, r.Client, r.tsNamespace, ss, func(s *appsv1.StatefulSet) {
s.ObjectMeta.Labels = ss.ObjectMeta.Labels
s.ObjectMeta.Annotations = ss.ObjectMeta.Annotations
s.ObjectMeta.OwnerReferences = ss.ObjectMeta.OwnerReferences
s.Spec = ss.Spec
}); err != nil {
return fmt.Errorf("error provisioning StatefulSet: %w", err)
}
if err := r.cleanupDanglingResources(ctx, pg); err != nil {
return fmt.Errorf("error cleaning up dangling resources: %w", err)
}
devices, err := r.getDeviceInfo(ctx, pg)
if err != nil {
return fmt.Errorf("failed to get device info: %w", err)
}
pg.Status.Devices = devices
return nil
}
// cleanupDanglingResources ensures we don't leak config secrets, state secrets, and
// tailnet devices when the number of replicas specified is reduced.
func (r *ProxyGroupReconciler) cleanupDanglingResources(ctx context.Context, pg *tsapi.ProxyGroup) error {
logger := r.logger(pg.Name)
metadata, err := r.getNodeMetadata(ctx, pg)
if err != nil {
return err
}
for _, m := range metadata {
if m.ordinal+1 <= int(pgReplicas(pg)) {
continue
}
// Dangling resource, delete the config + state Secrets, as well as
// deleting the device from the tailnet.
if err := r.deleteTailnetDevice(ctx, m.tsID, logger); err != nil {
return err
}
if err := r.Delete(ctx, m.stateSecret); err != nil {
if !apierrors.IsNotFound(err) {
return fmt.Errorf("error deleting state Secret %s: %w", m.stateSecret.Name, err)
}
}
configSecret := m.stateSecret.DeepCopy()
configSecret.Name += "-config"
if err := r.Delete(ctx, configSecret); err != nil {
if !apierrors.IsNotFound(err) {
return fmt.Errorf("error deleting config Secret %s: %w", configSecret.Name, err)
}
}
}
return nil
}
// maybeCleanup just deletes the device from the tailnet. All the kubernetes
// resources linked to a ProxyGroup will get cleaned up via owner references
// (which we can use because they are all in the same namespace).
func (r *ProxyGroupReconciler) maybeCleanup(ctx context.Context, pg *tsapi.ProxyGroup) (bool, error) {
logger := r.logger(pg.Name)
metadata, err := r.getNodeMetadata(ctx, pg)
if err != nil {
return false, err
}
for _, m := range metadata {
if err := r.deleteTailnetDevice(ctx, m.tsID, logger); err != nil {
return false, err
}
}
logger.Infof("cleaned up ProxyGroup resources")
r.mu.Lock()
r.proxyGroups.Remove(pg.UID)
gaugeProxyGroupResources.Set(int64(r.proxyGroups.Len()))
r.mu.Unlock()
return true, nil
}
func (r *ProxyGroupReconciler) deleteTailnetDevice(ctx context.Context, id tailcfg.StableNodeID, logger *zap.SugaredLogger) error {
logger.Debugf("deleting device %s from control", string(id))
if err := r.tsClient.DeleteDevice(ctx, string(id)); err != nil {
errResp := &tailscale.ErrResponse{}
if ok := errors.As(err, errResp); ok && errResp.Status == http.StatusNotFound {
logger.Debugf("device %s not found, likely because it has already been deleted from control", string(id))
} else {
return fmt.Errorf("error deleting device: %w", err)
}
} else {
logger.Debugf("device %s deleted from control", string(id))
}
return nil
}
func (r *ProxyGroupReconciler) ensureConfigSecretsCreated(ctx context.Context, pg *tsapi.ProxyGroup, proxyClass *tsapi.ProxyClass) (hash string, err error) {
logger := r.logger(pg.Name)
var configSHA256Sum string
for i := range pgReplicas(pg) {
cfgSecret := &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-%d-config", pg.Name, i),
Namespace: r.tsNamespace,
Labels: pgSecretLabels(pg.Name, "config"),
OwnerReferences: pgOwnerReference(pg),
},
}
var existingCfgSecret *corev1.Secret // unmodified copy of secret
if err := r.Get(ctx, client.ObjectKeyFromObject(cfgSecret), cfgSecret); err == nil {
logger.Debugf("secret %s/%s already exists", cfgSecret.GetNamespace(), cfgSecret.GetName())
existingCfgSecret = cfgSecret.DeepCopy()
} else if !apierrors.IsNotFound(err) {
return "", err
}
var authKey string
if existingCfgSecret == nil {
logger.Debugf("creating authkey for new ProxyGroup proxy")
tags := pg.Spec.Tags.Stringify()
if len(tags) == 0 {
tags = r.defaultTags
}
authKey, err = newAuthKey(ctx, r.tsClient, tags)
if err != nil {
return "", err
}
}
configs, err := pgTailscaledConfig(pg, proxyClass, i, authKey, existingCfgSecret)
if err != nil {
return "", fmt.Errorf("error creating tailscaled config: %w", err)
}
for cap, cfg := range configs {
cfgJSON, err := json.Marshal(cfg)
if err != nil {
return "", fmt.Errorf("error marshalling tailscaled config: %w", err)
}
mak.Set(&cfgSecret.StringData, tsoperator.TailscaledConfigFileName(cap), string(cfgJSON))
}
// The config sha256 sum is a value for a hash annotation used to trigger
// pod restarts when tailscaled config changes. Any config changes apply
// to all replicas, so it is sufficient to only hash the config for the
// first replica.
//
// In future, we're aiming to eliminate restarts altogether and have
// pods dynamically reload their config when it changes.
if i == 0 {
sum := sha256.New()
for _, cfg := range configs {
// Zero out the auth key so it doesn't affect the sha256 hash when we
// remove it from the config after the pods have all authed. Otherwise
// all the pods will need to restart immediately after authing.
cfg.AuthKey = nil
b, err := json.Marshal(cfg)
if err != nil {
return "", err
}
if _, err := sum.Write(b); err != nil {
return "", err
}
}
configSHA256Sum = fmt.Sprintf("%x", sum.Sum(nil))
}
if existingCfgSecret != nil {
logger.Debugf("patching the existing ProxyGroup config Secret %s", cfgSecret.Name)
if err := r.Patch(ctx, cfgSecret, client.MergeFrom(existingCfgSecret)); err != nil {
return "", err
}
} else {
logger.Debugf("creating a new config Secret %s for the ProxyGroup", cfgSecret.Name)
if err := r.Create(ctx, cfgSecret); err != nil {
return "", err
}
}
}
return configSHA256Sum, nil
}
func pgTailscaledConfig(pg *tsapi.ProxyGroup, class *tsapi.ProxyClass, idx int32, authKey string, oldSecret *corev1.Secret) (tailscaledConfigs, error) {
conf := &ipn.ConfigVAlpha{
Version: "alpha0",
AcceptDNS: "false",
AcceptRoutes: "false", // AcceptRoutes defaults to true
Locked: "false",
Hostname: ptr.To(fmt.Sprintf("%s-%d", pg.Name, idx)),
}
if pg.Spec.HostnamePrefix != "" {
conf.Hostname = ptr.To(fmt.Sprintf("%s%d", pg.Spec.HostnamePrefix, idx))
}
if shouldAcceptRoutes(class) {
conf.AcceptRoutes = "true"
}
deviceAuthed := false
for _, d := range pg.Status.Devices {
if d.Hostname == *conf.Hostname {
deviceAuthed = true
break
}
}
if authKey != "" {
conf.AuthKey = &authKey
} else if !deviceAuthed {
key, err := authKeyFromSecret(oldSecret)
if err != nil {
return nil, fmt.Errorf("error retrieving auth key from Secret: %w", err)
}
conf.AuthKey = key
}
capVerConfigs := make(map[tailcfg.CapabilityVersion]ipn.ConfigVAlpha)
capVerConfigs[106] = *conf
return capVerConfigs, nil
}
func (r *ProxyGroupReconciler) validate(_ *tsapi.ProxyGroup) error {
return nil
}
// getNodeMetadata gets metadata for all the pods owned by this ProxyGroup by
// querying their state Secrets. It may not return the same number of items as
// specified in the ProxyGroup spec if e.g. it is getting scaled up or down, or
// some pods have failed to write state.
func (r *ProxyGroupReconciler) getNodeMetadata(ctx context.Context, pg *tsapi.ProxyGroup) (metadata []nodeMetadata, _ error) {
// List all state secrets owned by this ProxyGroup.
secrets := &corev1.SecretList{}
if err := r.List(ctx, secrets, client.InNamespace(r.tsNamespace), client.MatchingLabels(pgSecretLabels(pg.Name, "state"))); err != nil {
return nil, fmt.Errorf("failed to list state Secrets: %w", err)
}
for _, secret := range secrets.Items {
var ordinal int
if _, err := fmt.Sscanf(secret.Name, pg.Name+"-%d", &ordinal); err != nil {
return nil, fmt.Errorf("unexpected secret %s was labelled as owned by the ProxyGroup %s: %w", secret.Name, pg.Name, err)
}
id, dnsName, ok, err := getNodeMetadata(ctx, &secret)
if err != nil {
return nil, err
}
if !ok {
continue
}
metadata = append(metadata, nodeMetadata{
ordinal: ordinal,
stateSecret: &secret,
tsID: id,
dnsName: dnsName,
})
}
return metadata, nil
}
func (r *ProxyGroupReconciler) getDeviceInfo(ctx context.Context, pg *tsapi.ProxyGroup) (devices []tsapi.TailnetDevice, _ error) {
metadata, err := r.getNodeMetadata(ctx, pg)
if err != nil {
return nil, err
}
for _, m := range metadata {
device, ok, err := getDeviceInfo(ctx, r.tsClient, m.stateSecret)
if err != nil {
return nil, err
}
if !ok {
continue
}
devices = append(devices, tsapi.TailnetDevice{
Hostname: device.Hostname,
TailnetIPs: device.TailnetIPs,
})
}
return devices, nil
}
type nodeMetadata struct {
ordinal int
stateSecret *corev1.Secret
tsID tailcfg.StableNodeID
dnsName string
}

View File

@@ -0,0 +1,303 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"fmt"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
rbacv1 "k8s.io/api/rbac/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"sigs.k8s.io/yaml"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/egressservices"
"tailscale.com/kube/kubetypes"
"tailscale.com/types/ptr"
)
// Returns the base StatefulSet definition for a ProxyGroup. A ProxyClass may be
// applied over the top after.
func pgStatefulSet(pg *tsapi.ProxyGroup, namespace, image, tsFirewallMode, cfgHash string) (*appsv1.StatefulSet, error) {
ss := new(appsv1.StatefulSet)
if err := yaml.Unmarshal(proxyYaml, &ss); err != nil {
return nil, fmt.Errorf("failed to unmarshal proxy spec: %w", err)
}
// Validate some base assumptions.
if len(ss.Spec.Template.Spec.InitContainers) != 1 {
return nil, fmt.Errorf("[unexpected] base proxy config had %d init containers instead of 1", len(ss.Spec.Template.Spec.InitContainers))
}
if len(ss.Spec.Template.Spec.Containers) != 1 {
return nil, fmt.Errorf("[unexpected] base proxy config had %d containers instead of 1", len(ss.Spec.Template.Spec.Containers))
}
// StatefulSet config.
ss.ObjectMeta = metav1.ObjectMeta{
Name: pg.Name,
Namespace: namespace,
Labels: pgLabels(pg.Name, nil),
OwnerReferences: pgOwnerReference(pg),
}
ss.Spec.Replicas = ptr.To(pgReplicas(pg))
ss.Spec.Selector = &metav1.LabelSelector{
MatchLabels: pgLabels(pg.Name, nil),
}
// Template config.
tmpl := &ss.Spec.Template
tmpl.ObjectMeta = metav1.ObjectMeta{
Name: pg.Name,
Namespace: namespace,
Labels: pgLabels(pg.Name, nil),
DeletionGracePeriodSeconds: ptr.To[int64](10),
Annotations: map[string]string{
podAnnotationLastSetConfigFileHash: cfgHash,
},
}
tmpl.Spec.ServiceAccountName = pg.Name
tmpl.Spec.InitContainers[0].Image = image
tmpl.Spec.Volumes = func() []corev1.Volume {
var volumes []corev1.Volume
for i := range pgReplicas(pg) {
volumes = append(volumes, corev1.Volume{
Name: fmt.Sprintf("tailscaledconfig-%d", i),
VolumeSource: corev1.VolumeSource{
Secret: &corev1.SecretVolumeSource{
SecretName: fmt.Sprintf("%s-%d-config", pg.Name, i),
},
},
})
}
if pg.Spec.Type == tsapi.ProxyGroupTypeEgress {
volumes = append(volumes, corev1.Volume{
Name: pgEgressCMName(pg.Name),
VolumeSource: corev1.VolumeSource{
ConfigMap: &corev1.ConfigMapVolumeSource{
LocalObjectReference: corev1.LocalObjectReference{
Name: pgEgressCMName(pg.Name),
},
},
},
})
}
return volumes
}()
// Main container config.
c := &ss.Spec.Template.Spec.Containers[0]
c.Image = image
c.VolumeMounts = func() []corev1.VolumeMount {
var mounts []corev1.VolumeMount
// TODO(tomhjp): Read config directly from the secret instead. The
// mounts change on scaling up/down which causes unnecessary restarts
// for pods that haven't meaningfully changed.
for i := range pgReplicas(pg) {
mounts = append(mounts, corev1.VolumeMount{
Name: fmt.Sprintf("tailscaledconfig-%d", i),
ReadOnly: true,
MountPath: fmt.Sprintf("/etc/tsconfig/%s-%d", pg.Name, i),
})
}
if pg.Spec.Type == tsapi.ProxyGroupTypeEgress {
mounts = append(mounts, corev1.VolumeMount{
Name: pgEgressCMName(pg.Name),
MountPath: "/etc/proxies",
ReadOnly: true,
})
}
return mounts
}()
c.Env = func() []corev1.EnvVar {
envs := []corev1.EnvVar{
{
// TODO(irbekrm): verify that .status.podIPs are always set, else read in .status.podIP as well.
Name: "POD_IPS", // this will be a comma separate list i.e 10.136.0.6,2600:1900:4011:161:0:e:0:6
ValueFrom: &corev1.EnvVarSource{
FieldRef: &corev1.ObjectFieldSelector{
FieldPath: "status.podIPs",
},
},
},
{
Name: "POD_NAME",
ValueFrom: &corev1.EnvVarSource{
FieldRef: &corev1.ObjectFieldSelector{
// Secret is named after the pod.
FieldPath: "metadata.name",
},
},
},
{
Name: "TS_KUBE_SECRET",
Value: "$(POD_NAME)",
},
{
Name: "TS_STATE",
Value: "kube:$(POD_NAME)",
},
{
Name: "TS_EXPERIMENTAL_VERSIONED_CONFIG_DIR",
Value: "/etc/tsconfig/$(POD_NAME)",
},
{
Name: "TS_USERSPACE",
Value: "false",
},
{
Name: "TS_INTERNAL_APP",
Value: kubetypes.AppProxyGroupEgress,
},
}
if tsFirewallMode != "" {
envs = append(envs, corev1.EnvVar{
Name: "TS_DEBUG_FIREWALL_MODE",
Value: tsFirewallMode,
})
}
if pg.Spec.Type == tsapi.ProxyGroupTypeEgress {
envs = append(envs, corev1.EnvVar{
Name: "TS_EGRESS_SERVICES_CONFIG_PATH",
Value: fmt.Sprintf("/etc/proxies/%s", egressservices.KeyEgressServices),
})
}
return envs
}()
return ss, nil
}
func pgServiceAccount(pg *tsapi.ProxyGroup, namespace string) *corev1.ServiceAccount {
return &corev1.ServiceAccount{
ObjectMeta: metav1.ObjectMeta{
Name: pg.Name,
Namespace: namespace,
Labels: pgLabels(pg.Name, nil),
OwnerReferences: pgOwnerReference(pg),
},
}
}
func pgRole(pg *tsapi.ProxyGroup, namespace string) *rbacv1.Role {
return &rbacv1.Role{
ObjectMeta: metav1.ObjectMeta{
Name: pg.Name,
Namespace: namespace,
Labels: pgLabels(pg.Name, nil),
OwnerReferences: pgOwnerReference(pg),
},
Rules: []rbacv1.PolicyRule{
{
APIGroups: []string{""},
Resources: []string{"secrets"},
Verbs: []string{
"get",
"patch",
"update",
},
ResourceNames: func() (secrets []string) {
for i := range pgReplicas(pg) {
secrets = append(secrets,
fmt.Sprintf("%s-%d-config", pg.Name, i), // Config with auth key.
fmt.Sprintf("%s-%d", pg.Name, i), // State.
)
}
return secrets
}(),
},
},
}
}
func pgRoleBinding(pg *tsapi.ProxyGroup, namespace string) *rbacv1.RoleBinding {
return &rbacv1.RoleBinding{
ObjectMeta: metav1.ObjectMeta{
Name: pg.Name,
Namespace: namespace,
Labels: pgLabels(pg.Name, nil),
OwnerReferences: pgOwnerReference(pg),
},
Subjects: []rbacv1.Subject{
{
Kind: "ServiceAccount",
Name: pg.Name,
Namespace: namespace,
},
},
RoleRef: rbacv1.RoleRef{
Kind: "Role",
Name: pg.Name,
},
}
}
func pgStateSecrets(pg *tsapi.ProxyGroup, namespace string) (secrets []*corev1.Secret) {
for i := range pgReplicas(pg) {
secrets = append(secrets, &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-%d", pg.Name, i),
Namespace: namespace,
Labels: pgSecretLabels(pg.Name, "state"),
OwnerReferences: pgOwnerReference(pg),
},
})
}
return secrets
}
func pgEgressCM(pg *tsapi.ProxyGroup, namespace string) *corev1.ConfigMap {
return &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: pgEgressCMName(pg.Name),
Namespace: namespace,
Labels: pgLabels(pg.Name, nil),
OwnerReferences: pgOwnerReference(pg),
},
}
}
func pgSecretLabels(pgName, typ string) map[string]string {
return pgLabels(pgName, map[string]string{
labelSecretType: typ, // "config" or "state".
})
}
func pgLabels(pgName string, customLabels map[string]string) map[string]string {
l := make(map[string]string, len(customLabels)+3)
for k, v := range customLabels {
l[k] = v
}
l[LabelManaged] = "true"
l[LabelParentType] = "proxygroup"
l[LabelParentName] = pgName
return l
}
func pgOwnerReference(owner *tsapi.ProxyGroup) []metav1.OwnerReference {
return []metav1.OwnerReference{*metav1.NewControllerRef(owner, tsapi.SchemeGroupVersion.WithKind("ProxyGroup"))}
}
func pgReplicas(pg *tsapi.ProxyGroup) int32 {
if pg.Spec.Replicas != nil {
return *pg.Spec.Replicas
}
return 2
}
func pgEgressCMName(pg string) string {
return fmt.Sprintf("%s-egress-config", pg)
}

View File

@@ -0,0 +1,287 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"context"
"encoding/json"
"fmt"
"testing"
"time"
"github.com/google/go-cmp/cmp"
"go.uber.org/zap"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
rbacv1 "k8s.io/api/rbac/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/tools/record"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
"tailscale.com/client/tailscale"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/tstest"
"tailscale.com/types/ptr"
)
const testProxyImage = "tailscale/tailscale:test"
var defaultProxyClassAnnotations = map[string]string{
"some-annotation": "from-the-proxy-class",
}
func TestProxyGroup(t *testing.T) {
const initialCfgHash = "6632726be70cf224049580deb4d317bba065915b5fd415461d60ed621c91b196"
pc := &tsapi.ProxyClass{
ObjectMeta: metav1.ObjectMeta{
Name: "default-pc",
},
Spec: tsapi.ProxyClassSpec{
StatefulSet: &tsapi.StatefulSet{
Annotations: defaultProxyClassAnnotations,
},
},
}
pg := &tsapi.ProxyGroup{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
Finalizers: []string{"tailscale.com/finalizer"},
},
}
fc := fake.NewClientBuilder().
WithScheme(tsapi.GlobalScheme).
WithObjects(pg, pc).
WithStatusSubresource(pg, pc).
Build()
tsClient := &fakeTSClient{}
zl, _ := zap.NewDevelopment()
fr := record.NewFakeRecorder(1)
cl := tstest.NewClock(tstest.ClockOpts{})
reconciler := &ProxyGroupReconciler{
tsNamespace: tsNamespace,
proxyImage: testProxyImage,
defaultTags: []string{"tag:test-tag"},
tsFirewallMode: "auto",
defaultProxyClass: "default-pc",
Client: fc,
tsClient: tsClient,
recorder: fr,
l: zl.Sugar(),
clock: cl,
}
t.Run("proxyclass_not_ready", func(t *testing.T) {
expectReconciled(t, reconciler, "", pg.Name)
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionFalse, reasonProxyGroupCreating, "the ProxyGroup's ProxyClass default-pc is not yet in a ready state, waiting...", 0, cl, zl.Sugar())
expectEqual(t, fc, pg, nil)
expectProxyGroupResources(t, fc, pg, false, "")
})
t.Run("observe_ProxyGroupCreating_status_reason", func(t *testing.T) {
pc.Status = tsapi.ProxyClassStatus{
Conditions: []metav1.Condition{{
Type: string(tsapi.ProxyClassReady),
Status: metav1.ConditionTrue,
Reason: reasonProxyClassValid,
Message: reasonProxyClassValid,
LastTransitionTime: metav1.Time{Time: cl.Now().Truncate(time.Second)},
}},
}
if err := fc.Status().Update(context.Background(), pc); err != nil {
t.Fatal(err)
}
expectReconciled(t, reconciler, "", pg.Name)
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionFalse, reasonProxyGroupCreating, "0/2 ProxyGroup pods running", 0, cl, zl.Sugar())
expectEqual(t, fc, pg, nil)
expectProxyGroupResources(t, fc, pg, true, initialCfgHash)
if expected := 1; reconciler.proxyGroups.Len() != expected {
t.Fatalf("expected %d recorders, got %d", expected, reconciler.proxyGroups.Len())
}
expectProxyGroupResources(t, fc, pg, true, initialCfgHash)
keyReq := tailscale.KeyCapabilities{
Devices: tailscale.KeyDeviceCapabilities{
Create: tailscale.KeyDeviceCreateCapabilities{
Reusable: false,
Ephemeral: false,
Preauthorized: true,
Tags: []string{"tag:test-tag"},
},
},
}
if diff := cmp.Diff(tsClient.KeyRequests(), []tailscale.KeyCapabilities{keyReq, keyReq}); diff != "" {
t.Fatalf("unexpected secrets (-got +want):\n%s", diff)
}
})
t.Run("simulate_successful_device_auth", func(t *testing.T) {
addNodeIDToStateSecrets(t, fc, pg)
expectReconciled(t, reconciler, "", pg.Name)
pg.Status.Devices = []tsapi.TailnetDevice{
{
Hostname: "hostname-nodeid-0",
TailnetIPs: []string{"1.2.3.4", "::1"},
},
{
Hostname: "hostname-nodeid-1",
TailnetIPs: []string{"1.2.3.4", "::1"},
},
}
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionTrue, reasonProxyGroupReady, reasonProxyGroupReady, 0, cl, zl.Sugar())
expectEqual(t, fc, pg, nil)
expectProxyGroupResources(t, fc, pg, true, initialCfgHash)
})
t.Run("scale_up_to_3", func(t *testing.T) {
pg.Spec.Replicas = ptr.To[int32](3)
mustUpdate(t, fc, "", pg.Name, func(p *tsapi.ProxyGroup) {
p.Spec = pg.Spec
})
expectReconciled(t, reconciler, "", pg.Name)
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionFalse, reasonProxyGroupCreating, "2/3 ProxyGroup pods running", 0, cl, zl.Sugar())
expectEqual(t, fc, pg, nil)
expectProxyGroupResources(t, fc, pg, true, initialCfgHash)
addNodeIDToStateSecrets(t, fc, pg)
expectReconciled(t, reconciler, "", pg.Name)
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionTrue, reasonProxyGroupReady, reasonProxyGroupReady, 0, cl, zl.Sugar())
pg.Status.Devices = append(pg.Status.Devices, tsapi.TailnetDevice{
Hostname: "hostname-nodeid-2",
TailnetIPs: []string{"1.2.3.4", "::1"},
})
expectEqual(t, fc, pg, nil)
expectProxyGroupResources(t, fc, pg, true, initialCfgHash)
})
t.Run("scale_down_to_1", func(t *testing.T) {
pg.Spec.Replicas = ptr.To[int32](1)
mustUpdate(t, fc, "", pg.Name, func(p *tsapi.ProxyGroup) {
p.Spec = pg.Spec
})
expectReconciled(t, reconciler, "", pg.Name)
pg.Status.Devices = pg.Status.Devices[:1] // truncate to only the first device.
expectEqual(t, fc, pg, nil)
expectProxyGroupResources(t, fc, pg, true, initialCfgHash)
})
t.Run("trigger_config_change_and_observe_new_config_hash", func(t *testing.T) {
pc.Spec.TailscaleConfig = &tsapi.TailscaleConfig{
AcceptRoutes: true,
}
mustUpdate(t, fc, "", pc.Name, func(p *tsapi.ProxyClass) {
p.Spec = pc.Spec
})
expectReconciled(t, reconciler, "", pg.Name)
expectEqual(t, fc, pg, nil)
expectProxyGroupResources(t, fc, pg, true, "518a86e9fae64f270f8e0ec2a2ea6ca06c10f725035d3d6caca132cd61e42a74")
})
t.Run("delete_and_cleanup", func(t *testing.T) {
if err := fc.Delete(context.Background(), pg); err != nil {
t.Fatal(err)
}
expectReconciled(t, reconciler, "", pg.Name)
expectMissing[tsapi.Recorder](t, fc, "", pg.Name)
if expected := 0; reconciler.proxyGroups.Len() != expected {
t.Fatalf("expected %d ProxyGroups, got %d", expected, reconciler.proxyGroups.Len())
}
// 2 nodes should get deleted as part of the scale down, and then finally
// the first node gets deleted with the ProxyGroup cleanup.
if diff := cmp.Diff(tsClient.deleted, []string{"nodeid-1", "nodeid-2", "nodeid-0"}); diff != "" {
t.Fatalf("unexpected deleted devices (-got +want):\n%s", diff)
}
// The fake client does not clean up objects whose owner has been
// deleted, so we can't test for the owned resources getting deleted.
})
}
func expectProxyGroupResources(t *testing.T, fc client.WithWatch, pg *tsapi.ProxyGroup, shouldExist bool, cfgHash string) {
t.Helper()
role := pgRole(pg, tsNamespace)
roleBinding := pgRoleBinding(pg, tsNamespace)
serviceAccount := pgServiceAccount(pg, tsNamespace)
statefulSet, err := pgStatefulSet(pg, tsNamespace, testProxyImage, "auto", cfgHash)
if err != nil {
t.Fatal(err)
}
statefulSet.Annotations = defaultProxyClassAnnotations
if shouldExist {
expectEqual(t, fc, role, nil)
expectEqual(t, fc, roleBinding, nil)
expectEqual(t, fc, serviceAccount, nil)
expectEqual(t, fc, statefulSet, nil)
} else {
expectMissing[rbacv1.Role](t, fc, role.Namespace, role.Name)
expectMissing[rbacv1.RoleBinding](t, fc, roleBinding.Namespace, roleBinding.Name)
expectMissing[corev1.ServiceAccount](t, fc, serviceAccount.Namespace, serviceAccount.Name)
expectMissing[appsv1.StatefulSet](t, fc, statefulSet.Namespace, statefulSet.Name)
}
var expectedSecrets []string
if shouldExist {
for i := range pgReplicas(pg) {
expectedSecrets = append(expectedSecrets,
fmt.Sprintf("%s-%d", pg.Name, i),
fmt.Sprintf("%s-%d-config", pg.Name, i),
)
}
}
expectSecrets(t, fc, expectedSecrets)
}
func expectSecrets(t *testing.T, fc client.WithWatch, expected []string) {
t.Helper()
secrets := &corev1.SecretList{}
if err := fc.List(context.Background(), secrets); err != nil {
t.Fatal(err)
}
var actual []string
for _, secret := range secrets.Items {
actual = append(actual, secret.Name)
}
if diff := cmp.Diff(actual, expected); diff != "" {
t.Fatalf("unexpected secrets (-got +want):\n%s", diff)
}
}
func addNodeIDToStateSecrets(t *testing.T, fc client.WithWatch, pg *tsapi.ProxyGroup) {
const key = "profile-abc"
for i := range pgReplicas(pg) {
bytes, err := json.Marshal(map[string]any{
"Config": map[string]any{
"NodeID": fmt.Sprintf("nodeid-%d", i),
},
})
if err != nil {
t.Fatal(err)
}
mustUpdate(t, fc, tsNamespace, fmt.Sprintf("test-%d", i), func(s *corev1.Secret) {
s.Data = map[string][]byte{
currentProfileKey: []byte(key),
key: bytes,
}
})
}
}

View File

@@ -31,6 +31,7 @@ import (
"tailscale.com/ipn"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/kubetypes"
"tailscale.com/net/netutil"
"tailscale.com/tailcfg"
"tailscale.com/types/opt"
@@ -46,11 +47,11 @@ const (
LabelParentType = "tailscale.com/parent-resource-type"
LabelParentName = "tailscale.com/parent-resource"
LabelParentNamespace = "tailscale.com/parent-resource-ns"
labelSecretType = "tailscale.com/secret-type" // "config" or "state".
// LabelProxyClass can be set by users on Connectors, tailscale
// Ingresses and Services that define cluster ingress or cluster egress,
// to specify that configuration in this ProxyClass should be applied to
// resources created for the Connector, Ingress or Service.
// LabelProxyClass can be set by users on tailscale Ingresses and Services that define cluster ingress or
// cluster egress, to specify that configuration in this ProxyClass should be applied to resources created for
// the Ingress or Service.
LabelProxyClass = "tailscale.com/proxy-class"
FinalizerName = "tailscale.com/finalizer"
@@ -64,6 +65,8 @@ const (
//MagicDNS name of tailnet node.
AnnotationTailnetTargetFQDN = "tailscale.com/tailnet-fqdn"
AnnotationProxyGroup = "tailscale.com/proxy-group"
// Annotations settable by users on ingresses.
AnnotationFunnel = "tailscale.com/funnel"
@@ -129,10 +132,13 @@ type tailscaleSTSConfig struct {
}
type connector struct {
// routes is a list of subnet routes that this Connector should expose.
// routes is a list of routes that this Connector should advertise either as a subnet router or as an app
// connector.
routes string
// isExitNode defines whether this Connector should act as an exit node.
isExitNode bool
// isAppConnector defines whether this Connector should act as an app connector.
isAppConnector bool
}
type tsnetServer interface {
CertDomains() []string
@@ -294,13 +300,14 @@ func (a *tailscaleSTSReconciler) reconcileHeadlessService(ctx context.Context, l
Selector: map[string]string{
"app": sts.ParentResourceUID,
},
IPFamilyPolicy: ptr.To(corev1.IPFamilyPolicyPreferDualStack),
},
}
logger.Debugf("reconciling headless service for StatefulSet")
return createOrUpdate(ctx, a.Client, a.operatorNamespace, hsvc, func(svc *corev1.Service) { svc.Spec = hsvc.Spec })
}
func (a *tailscaleSTSReconciler) createOrGetSecret(ctx context.Context, logger *zap.SugaredLogger, stsC *tailscaleSTSConfig, hsvc *corev1.Service) (secretName, hash string, configs tailscaleConfigs, _ error) {
func (a *tailscaleSTSReconciler) createOrGetSecret(ctx context.Context, logger *zap.SugaredLogger, stsC *tailscaleSTSConfig, hsvc *corev1.Service) (secretName, hash string, configs tailscaledConfigs, _ error) {
secret := &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
// Hardcode a -0 suffix so that in future, if we support
@@ -341,7 +348,7 @@ func (a *tailscaleSTSReconciler) createOrGetSecret(ctx context.Context, logger *
if len(tags) == 0 {
tags = a.defaultTags
}
authKey, err = a.newAuthKey(ctx, tags)
authKey, err = newAuthKey(ctx, a.tsClient, tags)
if err != nil {
return "", "", nil, err
}
@@ -358,7 +365,7 @@ func (a *tailscaleSTSReconciler) createOrGetSecret(ctx context.Context, logger *
latest := tailcfg.CapabilityVersion(-1)
var latestConfig ipn.ConfigVAlpha
for key, val := range configs {
fn := tsoperator.TailscaledConfigFileNameForCap(key)
fn := tsoperator.TailscaledConfigFileName(key)
b, err := json.Marshal(val)
if err != nil {
return "", "", nil, fmt.Errorf("error marshalling tailscaled config: %w", err)
@@ -417,6 +424,11 @@ func (a *tailscaleSTSReconciler) DeviceInfo(ctx context.Context, childLabels map
if sec == nil {
return "", "", nil, nil
}
return deviceInfo(sec)
}
func deviceInfo(sec *corev1.Secret) (id tailcfg.StableNodeID, hostname string, ips []string, err error) {
id = tailcfg.StableNodeID(sec.Data["device_id"])
if id == "" {
return "", "", nil, nil
@@ -440,7 +452,7 @@ func (a *tailscaleSTSReconciler) DeviceInfo(ctx context.Context, childLabels map
return id, hostname, ips, nil
}
func (a *tailscaleSTSReconciler) newAuthKey(ctx context.Context, tags []string) (string, error) {
func newAuthKey(ctx context.Context, tsClient tsClient, tags []string) (string, error) {
caps := tailscale.KeyCapabilities{
Devices: tailscale.KeyDeviceCapabilities{
Create: tailscale.KeyDeviceCreateCapabilities{
@@ -451,7 +463,7 @@ func (a *tailscaleSTSReconciler) newAuthKey(ctx context.Context, tags []string)
},
}
key, _, err := a.tsClient.CreateKey(ctx, caps)
key, _, err := tsClient.CreateKey(ctx, caps)
if err != nil {
return "", err
}
@@ -509,11 +521,6 @@ func (a *tailscaleSTSReconciler) reconcileSTS(ctx context.Context, logger *zap.S
Name: "TS_KUBE_SECRET",
Value: proxySecret,
},
corev1.EnvVar{
// Old tailscaled config key is still used for backwards compatibility.
Name: "EXPERIMENTAL_TS_CONFIGFILE_PATH",
Value: "/etc/tsconfig/tailscaled",
},
corev1.EnvVar{
// New style is in the form of cap-<capability-version>.hujson.
Name: "TS_EXPERIMENTAL_VERSIONED_CONFIG_DIR",
@@ -597,6 +604,18 @@ func (a *tailscaleSTSReconciler) reconcileSTS(ctx context.Context, logger *zap.S
},
})
}
app, err := appInfoForProxy(sts)
if err != nil {
// No need to error out if now or in future we end up in a
// situation where app info cannot be determined for one of the
// many proxy configurations that the operator can produce.
logger.Error("[unexpected] unable to determine proxy type")
} else {
container.Env = append(container.Env, corev1.EnvVar{
Name: "TS_INTERNAL_APP",
Value: app,
})
}
logger.Debugf("reconciling statefulset %s/%s", ss.GetNamespace(), ss.GetName())
if sts.ProxyClassName != "" {
logger.Debugf("configuring proxy resources with ProxyClass %s", sts.ProxyClassName)
@@ -610,6 +629,22 @@ func (a *tailscaleSTSReconciler) reconcileSTS(ctx context.Context, logger *zap.S
return createOrUpdate(ctx, a.Client, a.operatorNamespace, ss, updateSS)
}
func appInfoForProxy(cfg *tailscaleSTSConfig) (string, error) {
if cfg.ClusterTargetDNSName != "" || cfg.ClusterTargetIP != "" {
return kubetypes.AppIngressProxy, nil
}
if cfg.TailnetTargetFQDN != "" || cfg.TailnetTargetIP != "" {
return kubetypes.AppEgressProxy, nil
}
if cfg.ServeConfig != nil {
return kubetypes.AppIngressResource, nil
}
if cfg.Connector != nil {
return kubetypes.AppConnector, nil
}
return "", errors.New("unable to determine proxy type")
}
// mergeStatefulSetLabelsOrAnnots returns a map that contains all keys/values
// present in 'custom' map as well as those keys/values from the current map
// whose keys are present in the 'managed' map. The reason why this merge is
@@ -635,9 +670,9 @@ func applyProxyClassToStatefulSet(pc *tsapi.ProxyClass, ss *appsv1.StatefulSet,
if pc == nil || ss == nil {
return ss
}
if pc.Spec.Metrics != nil && pc.Spec.Metrics.Enable {
if stsCfg != nil && pc.Spec.Metrics != nil && pc.Spec.Metrics.Enable {
if stsCfg.TailnetTargetFQDN == "" && stsCfg.TailnetTargetIP == "" && !stsCfg.ForwardClusterTrafficViaL7IngressProxy {
enableMetrics(ss, pc)
enableMetrics(ss)
} else if stsCfg.ForwardClusterTrafficViaL7IngressProxy {
// TODO (irbekrm): fix this
// For Ingress proxies that have been configured with
@@ -681,6 +716,7 @@ func applyProxyClassToStatefulSet(pc *tsapi.ProxyClass, ss *appsv1.StatefulSet,
ss.Spec.Template.Spec.NodeSelector = wantsPod.NodeSelector
ss.Spec.Template.Spec.Affinity = wantsPod.Affinity
ss.Spec.Template.Spec.Tolerations = wantsPod.Tolerations
ss.Spec.Template.Spec.TopologySpreadConstraints = wantsPod.TopologySpreadConstraints
// Update containers.
updateContainer := func(overlay *tsapi.Container, base corev1.Container) corev1.Container {
@@ -725,7 +761,7 @@ func applyProxyClassToStatefulSet(pc *tsapi.ProxyClass, ss *appsv1.StatefulSet,
return ss
}
func enableMetrics(ss *appsv1.StatefulSet, pc *tsapi.ProxyClass) {
func enableMetrics(ss *appsv1.StatefulSet) {
for i, c := range ss.Spec.Template.Spec.Containers {
if c.Name == "tailscale" {
// Serve metrics on on <pod-ip>:9001/debug/metrics. If
@@ -748,16 +784,10 @@ func readAuthKey(secret *corev1.Secret, key string) (*string, error) {
return origConf.AuthKey, nil
}
// tailscaledConfig takes a proxy config, a newly generated auth key if
// generated and a Secret with the previous proxy state and auth key and
// returns tailscaled configuration and a hash of that configuration.
//
// As of 2024-05-09 it also returns legacy tailscaled config without the
// later added NoStatefulFilter field to support proxies older than cap95.
// TODO (irbekrm): remove the legacy config once we no longer need to support
// versions older than cap94,
// https://tailscale.com/kb/1236/kubernetes-operator#operator-and-proxies
func tailscaledConfig(stsC *tailscaleSTSConfig, newAuthkey string, oldSecret *corev1.Secret) (tailscaleConfigs, error) {
// tailscaledConfig takes a proxy config, a newly generated auth key if generated and a Secret with the previous proxy
// state and auth key and returns tailscaled config files for currently supported proxy versions and a hash of that
// configuration.
func tailscaledConfig(stsC *tailscaleSTSConfig, newAuthkey string, oldSecret *corev1.Secret) (tailscaledConfigs, error) {
conf := &ipn.ConfigVAlpha{
Version: "alpha0",
AcceptDNS: "false",
@@ -765,11 +795,13 @@ func tailscaledConfig(stsC *tailscaleSTSConfig, newAuthkey string, oldSecret *co
Locked: "false",
Hostname: &stsC.Hostname,
NoStatefulFiltering: "false",
AppConnector: &ipn.AppConnectorPrefs{Advertise: false},
}
// For egress proxies only, we need to ensure that stateful filtering is
// not in place so that traffic from cluster can be forwarded via
// Tailscale IPs.
// TODO (irbekrm): set it to true always as this is now the default in core.
if stsC.TailnetTargetFQDN != "" || stsC.TailnetTargetIP != "" {
conf.NoStatefulFiltering = "true"
}
@@ -779,6 +811,9 @@ func tailscaledConfig(stsC *tailscaleSTSConfig, newAuthkey string, oldSecret *co
return nil, fmt.Errorf("error calculating routes: %w", err)
}
conf.AdvertiseRoutes = routes
if stsC.Connector.isAppConnector {
conf.AppConnector.Advertise = true
}
}
if shouldAcceptRoutes(stsC.ProxyClass) {
conf.AcceptRoutes = "true"
@@ -786,40 +821,56 @@ func tailscaledConfig(stsC *tailscaleSTSConfig, newAuthkey string, oldSecret *co
if newAuthkey != "" {
conf.AuthKey = &newAuthkey
} else if oldSecret != nil {
var err error
latest := tailcfg.CapabilityVersion(-1)
latestStr := ""
for k, data := range oldSecret.Data {
// write to StringData, read from Data as StringData is write-only
if len(data) == 0 {
continue
}
v, err := tsoperator.CapVerFromFileName(k)
if err != nil {
continue
}
if v > latest {
latestStr = k
latest = v
}
} else if shouldRetainAuthKey(oldSecret) {
key, err := authKeyFromSecret(oldSecret)
if err != nil {
return nil, fmt.Errorf("error retrieving auth key from Secret: %w", err)
}
// Allow for configs that don't contain an auth key. Perhaps
// users have some mechanisms to delete them. Auth key is
// normally not needed after the initial login.
if latestStr != "" {
conf.AuthKey, err = readAuthKey(oldSecret, latestStr)
if err != nil {
return nil, err
}
conf.AuthKey = key
}
capVerConfigs := make(map[tailcfg.CapabilityVersion]ipn.ConfigVAlpha)
capVerConfigs[107] = *conf
// AppConnector config option is only understood by clients of capver 107 and newer.
conf.AppConnector = nil
capVerConfigs[95] = *conf
return capVerConfigs, nil
}
func authKeyFromSecret(s *corev1.Secret) (key *string, err error) {
latest := tailcfg.CapabilityVersion(-1)
latestStr := ""
for k, data := range s.Data {
// write to StringData, read from Data as StringData is write-only
if len(data) == 0 {
continue
}
v, err := tsoperator.CapVerFromFileName(k)
if err != nil {
continue
}
if v > latest {
latestStr = k
latest = v
}
}
capVerConfigs := make(map[tailcfg.CapabilityVersion]ipn.ConfigVAlpha)
capVerConfigs[95] = *conf
// legacy config should not contain NoStatefulFiltering field.
conf.NoStatefulFiltering.Clear()
capVerConfigs[94] = *conf
return capVerConfigs, nil
// Allow for configs that don't contain an auth key. Perhaps
// users have some mechanisms to delete them. Auth key is
// normally not needed after the initial login.
if latestStr != "" {
return readAuthKey(s, latestStr)
}
return key, nil
}
// shouldRetainAuthKey returns true if the state stored in a proxy's state Secret suggests that auth key should be
// retained (because the proxy has not yet successfully authenticated).
func shouldRetainAuthKey(s *corev1.Secret) bool {
if s == nil {
return false // nothing to retain here
}
return len(s.Data["device_id"]) == 0 // proxy has not authed yet
}
func shouldAcceptRoutes(pc *tsapi.ProxyClass) bool {
@@ -833,7 +884,7 @@ type ptrObject[T any] interface {
*T
}
type tailscaleConfigs map[tailcfg.CapabilityVersion]ipn.ConfigVAlpha
type tailscaledConfigs map[tailcfg.CapabilityVersion]ipn.ConfigVAlpha
// hashBytes produces a hash for the provided tailscaled config that is the same across
// different invocations of this code. We do not use the
@@ -844,7 +895,7 @@ type tailscaleConfigs map[tailcfg.CapabilityVersion]ipn.ConfigVAlpha
// thing that changed is operator version (the hash is also exposed to users via
// an annotation and might be confusing if it changes without the config having
// changed).
func tailscaledConfigHash(c tailscaleConfigs) (string, error) {
func tailscaledConfigHash(c tailscaledConfigs) (string, error) {
b, err := json.Marshal(c)
if err != nil {
return "", fmt.Errorf("error marshalling tailscaled configs: %w", err)

View File

@@ -18,6 +18,7 @@ import (
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"sigs.k8s.io/yaml"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/types/ptr"
@@ -73,6 +74,16 @@ func Test_applyProxyClassToStatefulSet(t *testing.T) {
NodeSelector: map[string]string{"beta.kubernetes.io/os": "linux"},
Affinity: &corev1.Affinity{NodeAffinity: &corev1.NodeAffinity{RequiredDuringSchedulingIgnoredDuringExecution: &corev1.NodeSelector{}}},
Tolerations: []corev1.Toleration{{Key: "", Operator: "Exists"}},
TopologySpreadConstraints: []corev1.TopologySpreadConstraint{
{
WhenUnsatisfiable: "DoNotSchedule",
TopologyKey: "kubernetes.io/hostname",
MaxSkew: 3,
LabelSelector: &metav1.LabelSelector{
MatchLabels: map[string]string{"foo": "bar"},
},
},
},
TailscaleContainer: &tsapi.Container{
SecurityContext: &corev1.SecurityContext{
Privileged: ptr.To(true),
@@ -159,6 +170,7 @@ func Test_applyProxyClassToStatefulSet(t *testing.T) {
wantSS.Spec.Template.Spec.NodeSelector = proxyClassAllOpts.Spec.StatefulSet.Pod.NodeSelector
wantSS.Spec.Template.Spec.Affinity = proxyClassAllOpts.Spec.StatefulSet.Pod.Affinity
wantSS.Spec.Template.Spec.Tolerations = proxyClassAllOpts.Spec.StatefulSet.Pod.Tolerations
wantSS.Spec.Template.Spec.TopologySpreadConstraints = proxyClassAllOpts.Spec.StatefulSet.Pod.TopologySpreadConstraints
wantSS.Spec.Template.Spec.Containers[0].SecurityContext = proxyClassAllOpts.Spec.StatefulSet.Pod.TailscaleContainer.SecurityContext
wantSS.Spec.Template.Spec.InitContainers[0].SecurityContext = proxyClassAllOpts.Spec.StatefulSet.Pod.TailscaleInitContainer.SecurityContext
wantSS.Spec.Template.Spec.Containers[0].Resources = proxyClassAllOpts.Spec.StatefulSet.Pod.TailscaleContainer.Resources
@@ -201,6 +213,7 @@ func Test_applyProxyClassToStatefulSet(t *testing.T) {
wantSS.Spec.Template.Spec.NodeSelector = proxyClassAllOpts.Spec.StatefulSet.Pod.NodeSelector
wantSS.Spec.Template.Spec.Affinity = proxyClassAllOpts.Spec.StatefulSet.Pod.Affinity
wantSS.Spec.Template.Spec.Tolerations = proxyClassAllOpts.Spec.StatefulSet.Pod.Tolerations
wantSS.Spec.Template.Spec.TopologySpreadConstraints = proxyClassAllOpts.Spec.StatefulSet.Pod.TopologySpreadConstraints
wantSS.Spec.Template.Spec.Containers[0].SecurityContext = proxyClassAllOpts.Spec.StatefulSet.Pod.TailscaleContainer.SecurityContext
wantSS.Spec.Template.Spec.Containers[0].Resources = proxyClassAllOpts.Spec.StatefulSet.Pod.TailscaleContainer.Resources
wantSS.Spec.Template.Spec.Containers[0].Env = append(wantSS.Spec.Template.Spec.Containers[0].Env, []corev1.EnvVar{{Name: "foo", Value: "bar"}, {Name: "TS_USERSPACE", Value: "true"}, {Name: "bar"}}...)

View File

@@ -25,6 +25,7 @@ import (
"sigs.k8s.io/controller-runtime/pkg/reconcile"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/kubetypes"
"tailscale.com/net/dns/resolvconffile"
"tailscale.com/tstime"
"tailscale.com/util/clientmetric"
@@ -62,15 +63,17 @@ type ServiceReconciler struct {
tsNamespace string
clock tstime.Clock
defaultProxyClass string
}
var (
// gaugeEgressProxies tracks the number of egress proxies that we're
// currently managing.
gaugeEgressProxies = clientmetric.NewGauge("k8s_egress_proxies")
gaugeEgressProxies = clientmetric.NewGauge(kubetypes.MetricEgressProxyCount)
// gaugeIngressProxies tracks the number of ingress proxies that we're
// currently managing.
gaugeIngressProxies = clientmetric.NewGauge("k8s_ingress_proxies")
gaugeIngressProxies = clientmetric.NewGauge(kubetypes.MetricIngressProxyCount)
)
func childResourceLabels(name, ns, typ string) map[string]string {
@@ -109,6 +112,10 @@ func (a *ServiceReconciler) Reconcile(ctx context.Context, req reconcile.Request
return reconcile.Result{}, fmt.Errorf("failed to get svc: %w", err)
}
if _, ok := svc.Annotations[AnnotationProxyGroup]; ok {
return reconcile.Result{}, nil // this reconciler should not look at Services for ProxyGroup
}
if !svc.DeletionTimestamp.IsZero() || !a.isTailscaleService(svc) {
logger.Debugf("service is being deleted or is (no longer) referring to Tailscale ingress/egress, ensuring any created resources are cleaned up")
return reconcile.Result{}, a.maybeCleanup(ctx, logger, svc)
@@ -208,7 +215,7 @@ func (a *ServiceReconciler) maybeProvision(ctx context.Context, logger *zap.Suga
return nil
}
proxyClass := proxyClassForObject(svc)
proxyClass := proxyClassForObject(svc, a.defaultProxyClass)
if proxyClass != "" {
if ready, err := proxyClassIsReady(ctx, proxyClass, a.Client); err != nil {
errMsg := fmt.Errorf("error verifying ProxyClass for Service: %w", err)
@@ -325,7 +332,7 @@ func (a *ServiceReconciler) maybeProvision(ctx context.Context, logger *zap.Suga
if err != nil {
msg := fmt.Sprintf("failed to parse cluster IP: %v", err)
tsoperator.SetServiceCondition(svc, tsapi.ProxyReady, metav1.ConditionFalse, reasonProxyFailed, msg, a.clock, logger)
return fmt.Errorf(msg)
return errors.New(msg)
}
for _, ip := range tsIPs {
addr, err := netip.ParseAddr(ip)
@@ -351,6 +358,15 @@ func validateService(svc *corev1.Service) []string {
violations = append(violations, fmt.Sprintf("invalid value of annotation %s: %q does not appear to be a valid MagicDNS name", AnnotationTailnetTargetFQDN, fqdn))
}
}
if ipStr := svc.Annotations[AnnotationTailnetTargetIP]; ipStr != "" {
ip, err := netip.ParseAddr(ipStr)
if err != nil {
violations = append(violations, fmt.Sprintf("invalid value of annotation %s: %q could not be parsed as a valid IP Address, error: %s", AnnotationTailnetTargetIP, ipStr, err))
} else if !ip.IsValid() {
violations = append(violations, fmt.Sprintf("parsed IP address in annotation %s: %q is not valid", AnnotationTailnetTargetIP, ipStr))
}
}
svcName := nameForService(svc)
if err := dnsname.ValidLabel(svcName); err != nil {
if _, ok := svc.Annotations[AnnotationHostname]; ok {
@@ -404,8 +420,14 @@ func tailnetTargetAnnotation(svc *corev1.Service) string {
return svc.Annotations[annotationTailnetTargetIPOld]
}
func proxyClassForObject(o client.Object) string {
return o.GetLabels()[LabelProxyClass]
// proxyClassForObject returns the proxy class for the given object. If the
// object does not have a proxy class label, it returns the default proxy class
func proxyClassForObject(o client.Object, proxyDefaultClass string) string {
proxyClass, exists := o.GetLabels()[LabelProxyClass]
if !exists {
proxyClass = proxyDefaultClass
}
return proxyClass
}
func proxyClassIsReady(ctx context.Context, name string, cl client.Client) (bool, error) {

Some files were not shown because too many files have changed in this diff Show More