Compare commits

...

139 Commits

Author SHA1 Message Date
Andrew Dunham
fc4048014e control/keyfallback: add baked-in fallback for control key
Similar to how we bake in the DERPMap to ensure that we can reach the
DERP servers if DNS isn't working, also bake in the control key for the
default control server that we use if the control server is down.

Updates #13890

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I18ef0381e266bd3db10063685993bc3cb76b2f42
2024-10-24 11:35:48 -05:00
Andrew Dunham
b2665d9b89 net/netcheck: add a Now field to the netcheck Report
This allows us to print the time that a netcheck was run, which is
useful in debugging.

Updates #10972

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id48d30d4eb6d5208efb2b1526a71d83fe7f9320b
2024-10-22 15:52:42 -04:00
Brad Fitzpatrick
ae5bc88ebe health: fix spurious warning about DERP home region '0'
Updates #13650

Change-Id: I6b0f165f66da3f881a4caa25d2d9936dc2a7f22c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-22 10:01:30 -05:00
Maisem Ali
85241f8408 net/tstun: use /10 as subnet for TAP mode; read IP from netmap
Few changes to resolve TODOs in the code:
- Instead of using a hardcoded IP, get it from the netmap.
- Use 100.100.100.100 as the gateway IP
- Use the /10 CGNAT range instead of a random /24

Updates #2589

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-10-21 17:24:29 -07:00
Maisem Ali
d4d21a0bbf net/tstun: restore tap mode functionality
It had bit-rotted likely during the transition to vector io in
76389d8baf. Tested on Ubuntu 24.04
by creating a netns and doing the DHCP dance to get an IP.

Updates #2589

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-10-21 17:02:53 -07:00
Nick Khyl
0f4c9c0ecb cmd/viewer: import types/views when generating a getter for a map field
Fixes #13873

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-21 16:29:16 -05:00
Andrea Gottardo
f8f53bb6d4 health: remove SysDNSOS, add two Warnables for read+set system DNS config (#13874) 2024-10-21 13:40:43 -07:00
Erisa A
72587ab03c scripts/installer.sh: allow Archcraft for Arch packages (#13870)
Fixes #13869

Signed-off-by: Erisa A <erisa@tailscale.com>
2024-10-21 18:13:06 +01:00
Brad Fitzpatrick
c76a6e5167 derp: track client-advertised non-ideal DERP connections in more places
In f77821fd63 (released in v1.72.0), we made the client tell a DERP server
when the connection was not its ideal choice (the first node in its region).

But we didn't do anything with that information until now. This adds a
metric about how many such connections are on a given derper, and also
adds a bit to the PeerPresentFlags bitmask so watchers can identify
(and rebalance) them.

Updates tailscale/corp#372

Change-Id: Ief8af448750aa6d598e5939a57c062f4e55962be
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-20 19:56:28 -07:00
Andrea Gottardo
fd77965f23 net/tlsdial: call out firewalls blocking Tailscale in health warnings (#13840)
Updates tailscale/tailscale#13839

Adds a new blockblame package which can detect common MITM SSL certificates used by network appliances. We use this in `tlsdial` to display a dedicated health warning when we cannot connect to control, and a network appliance MITM attack is detected.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-10-19 00:35:46 +00:00
Mario Minardi
e711ee5d22 release/dist: clamp min / max version for synology package centre (#13857)
Clamp the min and max version for DSM 7.0 and DSM 7.2 packages when we
are building packages for the synology package centre. This change
leaves packages destined for pkgs.tailscale.com with just the min
version set to not break packages in the wild / our update flow.

Updates https://github.com/tailscale/corp/issues/22908

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-10-18 14:20:40 -06:00
Jordan Whited
877fa504b4 net/netcheck: remove arbitrary deadlines from GetReport() tests (#13832)
GetReport() may have side effects when the caller enforces a deadline
that is shorter than ReportTimeout.

Updates #13783
Updates #13394

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-10-18 13:12:07 -07:00
Nick Khyl
874db2173b ipn/{ipnauth,ipnlocal,ipnserver}: send the auth URL to the user who started interactive login
We add the ClientID() method to the ipnauth.Actor interface and updated ipnserver.actor to implement it.
This method returns a unique ID of the connected client if the actor represents one. It helps link a series
of interactions initiated by the client, such as when a notification needs to be sent back to a specific session,
rather than all active sessions, in response to a certain request.

We also add LocalBackend.WatchNotificationsAs and LocalBackend.StartLoginInteractiveAs methods,
which are like WatchNotifications and StartLoginInteractive but accept an additional parameter
specifying an ipnauth.Actor who initiates the operation. We store these actor identities in
watchSession.owner and LocalBackend.authActor, respectively,and implement LocalBackend.sendTo
and related helper methods to enable sending notifications to watchSessions associated with actors
(or, more broadly, identifiable recipients).

We then use the above to change who receives the BrowseToURL notifications:
 - For user-initiated, interactive logins, the notification is delivered only to the user who initiated the
   process. If the initiating actor represents a specific connected client, the URL notification is sent back
   to the same LocalAPI client that called StartLoginInteractive. Otherwise, the notification is sent to all
   clients connected as that user.
   Currently, we only differentiate between users on Windows, as it is inherently a multi-user OS.
 - In all other cases (e.g., node key expiration), we send the notification to all connected users.

Updates tailscale/corp#18342

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-18 15:10:02 -05:00
Jordan Whited
bb60da2764 derp: add sclient write deadline timeout metric (#13831)
Write timeouts can be indicative of stalled TCP streams. Understanding
changes in the rate of such events can be helpful in an ops context.

Updates tailscale/corp#23668

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-10-18 10:53:49 -07:00
Brad Fitzpatrick
18fc093c0d derp: give trusted mesh peers longer write timeouts
Updates tailscale/corp#24014

Change-Id: I700872be48ab337dce8e11cabef7f82b97f0422a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-18 09:37:20 -07:00
Andrew Dunham
c0a9895748 scripts/installer.sh: support DNF5
This fixes the installation on newer Fedora versions that use dnf5 as
the 'dnf' binary.

Updates #13828

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I39513243c81640fab244a32b7dbb3f32071e9fce
2024-10-17 20:28:41 -04:00
Andrea Gottardo
fa95318a47 tool/gocross: add support for tvOS Simulator (#13847)
Updates ENG-5321

Allow gocross to build a static library for the Apple TV Simulator.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-10-17 15:37:10 -07:00
Naman Sood
22c89fcb19 cmd/tailscale,ipn,tailcfg: add tailscale advertise subcommand behind envknob (#13734)
Signed-off-by: Naman Sood <mail@nsood.in>
2024-10-16 19:08:06 -04:00
Mario Minardi
d32d742af0 ipn/ipnlocal: error when trying to use exit node on unsupported platform (#13726)
Adds logic to `checkExitNodePrefsLocked` to return an error when
attempting to use exit nodes on a platform where this is not supported.
This mirrors logic that was added to error out when trying to use `ssh`
on an unsupported platform, and has very similar semantics.

Fixes https://github.com/tailscale/tailscale/issues/13724

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-10-16 14:09:53 -06:00
Brad Fitzpatrick
6a885dbc36 wgengine/magicsock: fix CI-only test warning of missing health tracker
While looking at deflaking TestTwoDevicePing/ping_1.0.0.2_via_SendPacket,
there were a bunch of distracting:

    WARNING: (non-fatal) nil health.Tracker (being strict in CI): ...

This pacifies those so it's easier to work on actually deflaking the test.

Updates #11762
Updates #11874

Change-Id: I08dcb44511d4996b68d5f1ce5a2619b555a2a773
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-16 09:40:49 -07:00
Christian
74dd24ce71 cmd/tsconnect, logpolicy: fixes for wasm_js.go
* updates to LocalBackend require metrics to be passed in which are now initialized
* os.MkdirTemp isn't supported in wasm/js so we simply return empty
  string for logger
* adds a UDP dialer which was missing and led to the dialer being
  incompletely initialized

Fixes #10454 and #8272

Signed-off-by: Christian <christian@devzero.io>
2024-10-16 09:39:48 -07:00
Nick Khyl
ff5f233c3a util/syspolicy: add rsop package that provides access to the resultant policy
In this PR we add syspolicy/rsop package that facilitates policy source registration
and provides access to the resultant policy merged from all registered sources for a
given scope.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-16 00:06:14 -05:00
Andrew Dunham
2aa9125ac4 cmd/derpprobe: add /healthz endpoint
For a customer that wants to run their own DERP prober, let's add a
/healthz endpoint that can be used to monitor derpprobe itself.

Updates #6526

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Iba315c999fc0b1a93d8c503c07cc733b4c8d5b6b
2024-10-15 16:35:24 -04:00
Tom Proctor
5f22f72636 hostinfo,build_docker.sh,tailcfg: more reliably detect being in a container (#13826)
Our existing container-detection tricks did not work on Kubernetes,
where Docker is no longer used as a container runtime. Extends the
existing go build tags for containers to the other container packages
and uses that to reliably detect builds that were created by Tailscale
for use in a container. Unfortunately this doesn't necessarily improve
detection for users' custom builds, but that's a separate issue.

Updates #13825

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-15 19:38:11 +01:00
License Updater
a8f9c0d6e4 licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2024-10-14 08:10:13 -07:00
Kristoffer Dalby
e0d711c478 {net/connstats,wgengine/magicsock}: fix packet counting in connstats
connstats currently increments the packet counter whenever it is called
to store a length of data, however when udp batch sending was introduced
we pass the length for a series of packages, and it is only incremented
ones, making it count wrongly if we are on a platform supporting udp
batches.

Updates tailscale/corp#22075

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-10-14 14:17:56 +02:00
Kristoffer Dalby
40c991f6b8 wgengine: instrument with usermetrics
Updates tailscale/corp#22075

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-10-14 11:34:31 +02:00
Paul Scott
adc8368964 tstest: avoid Fatal in ResourceCheck to show panic (#13790)
Fixes #13789

Signed-off-by: Paul Scott <paul@tailscale.com>
2024-10-14 10:02:04 +01:00
Percy Wegmann
12e6094d9c ssh/tailssh: calculate passthrough environment at latest possible stage
This allows passing through any environment variables that we set ourselves, for example DBUS_SESSION_BUS_ADDRESS.

Updates #11175

Co-authored-by: Mario Minardi <mario@tailscale.com>
Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-10-11 15:25:30 -05:00
Joe Tsai
ecc8035f73 types/bools: add Compare to compare boolean values (#13792)
The bools.Compare function compares boolean values
by reporting -1, 0, +1 for ordering so that it can be easily
used with slices.SortFunc.

Updates #cleanup
Updates tailscale/corp#11038

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-10-11 13:12:18 -07:00
Nick Khyl
f07ff47922 net/dns/resolver: add tests for using a forwarder with multiple upstream resolvers
If multiple upstream DNS servers are available, quad-100 sends requests to all of them
and forwards the first successful response, if any. If no successful responses are received,
it propagates the first failure from any of them.

This PR adds some test coverage for these scenarios.

Updates #13571

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-11 12:02:27 -05:00
Nick Hill
c2144c44a3 net/dns/resolver: update (*forwarder).forwardWithDestChan to always return an error unless it sends a response to responseChan
We currently have two executions paths where (*forwarder).forwardWithDestChan
returns nil, rather than an error, without sending a DNS response to responseChan.

These paths are accompanied by a comment that reads:
// Returning an error will cause an internal retry, there is
// nothing we can do if parsing failed. Just drop the packet.
But it is not (or no longer longer) accurate: returning an error from forwardWithDestChan
does not currently cause a retry.

Moreover, although these paths are currently unreachable due to implementation details,
if (*forwarder).forwardWithDestChan were to return nil without sending a response to
responseChan, it would cause a deadlock at one call site and a panic at another.

Therefore, we update (*forwarder).forwardWithDestChan to return errors in those two paths
and remove comments that were no longer accurate and misleading.

Updates #cleanup
Updates #13571

Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
2024-10-11 12:02:27 -05:00
Nick Hill
e7545f2eac net/dns/resolver: translate 5xx DoH server errors into SERVFAIL DNS responses
If a DoH server returns an HTTP server error, rather than a SERVFAIL within
a successful HTTP response, we should handle it in the same way as SERVFAIL.

Updates #13571

Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
2024-10-11 12:02:27 -05:00
Nick Hill
17335d2104 net/dns/resolver: forward SERVFAIL responses over PeerDNS
As per the docstring, (*forwarder).forwardWithDestChan should either send to responseChan
and returns nil, or returns a non-nil error (without sending to the channel).
However, this does not hold when all upstream DNS servers replied with an error.

We've been handling this special error path in (*Resolver).Query but not in (*Resolver).HandlePeerDNSQuery.
As a result, SERVFAIL responses from upstream servers were being converted into HTTP 503 responses,
instead of being properly forwarded as SERVFAIL within a successful HTTP response, as per RFC 8484, section 4.2.1:
A successful HTTP response with a 2xx status code (see Section 6.3 of [RFC7231]) is used for any valid DNS response,
regardless of the DNS response code. For example, a successful 2xx HTTP status code is used even with a DNS message
whose DNS response code indicates failure, such as SERVFAIL or NXDOMAIN.

In this PR we fix (*forwarder).forwardWithDestChan to no longer return an error when it sends a response to responseChan,
and remove the special handling in (*Resolver).Query, as it is no longer necessary.

Updates #13571

Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
2024-10-11 12:02:27 -05:00
Percy Wegmann
f9949cde8b client/tailscale,cmd/{cli,get-authkey,k8s-operator}: set distinct User-Agents
This helps better distinguish what is generating activity to the
Tailscale public API.

Updates tailscale/corp#23838

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-10-11 10:45:03 -05:00
Jordan Whited
33029d4486 net/netcheck: fix netcheck cli-triggered nil pointer deref (#13782)
Updates #13780

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-10-10 15:52:47 -07:00
Jonathan Nobels
acb4a22dcc VERSION.txt: this is v1.77.0 (#13779) 2024-10-10 11:34:14 -07:00
Brad Fitzpatrick
508980603b ipn/conffile: don't depend on hujson on iOS/Android
Fixes #13772

Change-Id: I3ae03a5ee48c801f2e5ea12d1e54681df25d4604
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-10 09:14:36 -07:00
Andrew Dunham
91f58c5e63 tsnet: fix panic caused by logging after test finishes
Updates #13773

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I95e03eb6aef1639bd4a2efd3a415e2c10cdebc5a
2024-10-10 11:11:02 -04:00
Brad Fitzpatrick
1938685d39 clientupdate: don't link distsign on platforms that don't download
Updates tailscale/corp#20099

Change-Id: Ie3b782379b19d5f7890a8d3a378096b4f3e8a612
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-10 06:32:50 -07:00
Irbe Krumina
db1519cc9f k8s-operator/apis: revert ProxyGroup readiness cond name change (#13770)
No need to prefix this with 'Tailscale' for tailscale.com
custom resource types.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-10 13:00:32 +01:00
Brad Fitzpatrick
2531065d10 clientupdate, ipn/localapi: don't use google/uuid, thin iOS deps
We were using google/uuid in two places and that brought in database/sql/driver.

We didn't need it in either place.

Updates #13760
Updates tailscale/corp#20099

Change-Id: Ieed32f1bebe35d35f47ec5a2a429268f24f11f1f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-09 20:27:35 -07:00
Brad Fitzpatrick
fb420be176 safesocket: don't depend on go-ps on iOS
There's never a tailscaled on iOS. And we can't run child processes to
look for it anyway.

Updates tailscale/corp#20099

Change-Id: Ieb3776f4bb440c4f1c442fdd169bacbe17f23ddb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-09 18:35:53 -07:00
Brad Fitzpatrick
367fba8520 control/controlhttp: don't link ts2021 server + websocket code on iOS
We probably shouldn't link it in anywhere, but let's fix iOS for now.

Updates #13762
Updates tailscale/corp#20099

Change-Id: Idac116e9340434334c256acba3866f02bd19827c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-09 18:25:02 -07:00
Joe Tsai
52ef27ab7c taildrop: fix defer in loop (#13757)
However, this affects the scope of a defer.

Updates #11038

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-10-09 14:09:58 -07:00
Joe Tsai
5b7303817e syncs: allocate map with Map.WithLock (#13755)
One primary purpose of WithLock is to mutate the underlying map.
However, this can lead to a panic if it happens to be nil.
Thus, always allocate a map before passing it to f.

Updates tailscale/corp#11038

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-10-09 14:03:37 -07:00
Brad Fitzpatrick
c763b7a7db syncs: delete Map.Range, update callers to iterators
Updates #11038

Change-Id: I2819fed896cc4035aba5e4e141b52c12637373b1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-09 13:56:13 -07:00
Percy Wegmann
2cadb80fb2 util/vizerror: add WrapWithMessage
Thus new function allows constructing vizerrors that combine a message
appropriate for display to users with a wrapped underlying error.

Updates tailscale/corp#23781

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-10-09 12:59:25 -05:00
Joe Tsai
910b4e8e6a syncs: add iterators to Map (#13739)
Add Keys, Values, and All to iterate over
all keys, values, and entries, respectively.

Updates #11038

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-10-09 10:28:12 -07:00
Irbe Krumina
89ee6bbdae cmd/k8s-operator,k8s-operator/apis: set a readiness condition on egress Services for ProxyGroup (#13746)
cmd/k8s-operator,k8s-operator/apis: set a readiness condition on egress Services

Set a readiness condition on ExternalName Services that define a tailnet target
to route cluster traffic to via a ProxyGroup's proxies. The condition
is set to true if at least one proxy is currently set up to route.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-09 18:23:40 +01:00
Brad Fitzpatrick
94c79659fa types/views: add iterators to the three Map view types
Their callers using Range are all kinda clunky feeling. Iterators
should make them more readable.

Updates #12912

Change-Id: I93461eba8e735276fda4a8558a4ae4bfd6c04922
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-09 10:00:29 -07:00
Irbe Krumina
f6d4d03355 cmd/k8s-operator: don't error out if ProxyClass for ProxyGroup not found. (#13736)
We don't need to error out and continuously reconcile if ProxyClass
has not (yet) been created, once it gets created the ProxyGroup
reconciler will get triggered.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-09 13:23:00 +01:00
Irbe Krumina
60011e73b8 cmd/k8s-operator: fix Pod IP selection (#13743)
Ensure that .status.podIPs is used to select Pod's IP
in all reconcilers.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-09 13:22:50 +01:00
Nick Khyl
da40609abd util/syspolicy, ipn: add "tailscale debug component-logs" support
Fixes #13313
Fixes #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-08 18:11:23 -05:00
Nick Khyl
29cf59a9b4 util/syspolicy/setting: update Snapshot to use Go 1.23 iterators
Updates #12912
Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-08 15:02:23 -05:00
Tom Proctor
07c157ee9f cmd/k8s-operator: base ProxyGroup StatefulSet on common proxy.yaml definition (#13714)
As discussed in #13684, base the ProxyGroup's proxy definitions on the same
scaffolding as the existing proxies, as defined in proxy.yaml

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-08 20:05:08 +01:00
Tom Proctor
83efadee9f kube/egressservices: improve egress ports config readability (#13722)
Instead of converting our PortMap struct to a string during marshalling
for use as a key, convert the whole collection of PortMaps to a list of
PortMap objects, which improves the readability of the JSON config while
still keeping the data structure we need in the code.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-08 19:48:18 +01:00
Brad Fitzpatrick
841eaacb07 net/sockstats: quiet some log spam in release builds
Updates #13731

Change-Id: Ibee85426827ebb9e43a1c42a9c07c847daa50117
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-08 11:02:46 -07:00
Irbe Krumina
861dc3631c cmd/{k8s-operator,containerboot},kube/egressservices: fix Pod IP check for dual stack clusters (#13721)
Currently egress Services for ProxyGroup only work for Pods and Services
with IPv4 addresses. Ensure that it works on dual stack clusters by reading
proxy Pod's IP from the .status.podIPs list that always contains both
IPv4 and IPv6 address (if the Pod has them) rather than .status.podIP that
could contain IPv6 only for a dual stack cluster.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-08 18:35:23 +01:00
Andrew Dunham
8ee7f82bf4 net/netcheck: don't panic if a region has no Nodes
Updates #13728

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1e8319d6b2da013ae48f15113b30c9333e69cc0b
2024-10-08 12:52:27 -04:00
Tom Proctor
36cb2e4e5f cmd/k8s-operator,k8s-operator: use default ProxyClass if set for ProxyGroup (#13720)
The default ProxyClass can be set via helm chart or env var, and applies
to all proxies that do not otherwise have an explicit ProxyClass set.
This ensures proxies created by the new ProxyGroup CRD are consistent
with the behaviour of existing proxies

Nearby but unrelated changes:

* Fix up double error logs (controller runtime logs returned errors)
* Fix a couple of variable names

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-08 17:34:34 +01:00
Tom Proctor
cba2e76568 cmd/containerboot: simplify k8s setup logic (#13627)
Rearrange conditionals to reduce indentation and make it a bit easier to read
the logic. Also makes some error message updates for better consistency
with the recent decision around capitalising resource names and the
upcoming addition of config secrets.

Updates #cleanup

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-08 17:13:00 +01:00
dependabot[bot]
866714a894 .github: Bump github/codeql-action from 3.26.9 to 3.26.11 (#13710)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.9 to 3.26.11.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](461ef6c76d...6db8d6351f)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 22:15:40 -06:00
dependabot[bot]
266c14d6ca .github: Bump actions/cache from 4.0.2 to 4.1.0 (#13711)
Bumps [actions/cache](https://github.com/actions/cache) from 4.0.2 to 4.1.0.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](0c45773b62...2cdf405574)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 20:48:06 -06:00
Nick Hill
9a73462ea4 types/lazy: add DeferredInit type
It is sometimes necessary to defer initialization steps until the first actual usage
or until certain prerequisites have been met. For example, policy setting and
policy source registration should not occur during package initialization.
Instead, they should be deferred until the syspolicy package is actually used.
Additionally, any errors should be properly handled and reported, rather than
causing a panic within the package's init function.

In this PR, we add DeferredInit, to facilitate the registration and invocation
of deferred initialization functions.

Updates #12687

Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
2024-10-07 15:43:22 -05:00
Brad Fitzpatrick
f3de4e96a8 derp: fix omitted word in comment
Fix comment just added in 38f236c725.

Updates tailscale/corp#23668
Updates #cleanup

Change-Id: Icbe112e24fcccf8c61c759c631ad09f3e5480547
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-07 12:21:10 -07:00
Irbe Krumina
7f016baa87 cmd/k8s-operator,k8s-operator: create ConfigMap for egress services + small fixes for egress services (#13715)
cmd/k8s-operator, k8s-operator: create ConfigMap for egress services + small reconciler fixes

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-07 20:12:56 +01:00
Brad Fitzpatrick
38f236c725 derp: add server metric for batch write sizes
Updates tailscale/corp#23668

Change-Id: Ie6268c4035a3b29fd53c072c5793e4cbba93d031
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-07 11:22:51 -07:00
Erisa A
c588c36233 types/key: use tlpub: in error message (#13707)
Fixes tailscale/corp#19442

Signed-off-by: Erisa A <erisa@tailscale.com>
2024-10-07 17:28:45 +01:00
Brad Fitzpatrick
cb10eddc26 tool/gocross: fix argument order to find
To avoid warning:

    find: warning: you have specified the global option -maxdepth after the argument -type, but global options are not positional, i.e., -maxdepth affects tests specified before it as well as those specified after it.  Please specify global options before other arguments.

Fixes tailscale/corp#23689

Change-Id: I91ee260b295c552c0a029883d5e406733e081478
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-07 08:07:03 -07:00
Tom Proctor
e48cddfbb3 cmd/{containerboot,k8s-operator},k8s-operator,kube: add ProxyGroup controller (#13684)
Implements the controller for the new ProxyGroup CRD, designed for
running proxies in a high availability configuration. Each proxy gets
its own config and state Secret, and its own tailscale node ID.

We are currently mounting all of the config secrets into the container,
but will stop mounting them and instead read them directly from the kube
API once #13578 is implemented.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-07 14:58:45 +01:00
Brad Fitzpatrick
1005cbc1e4 tailscaleroot: panic if tailscale_go build tag but Go toolchain mismatch
Fixes #13527

Change-Id: I05921969a84a303b60d1b3b9227aff9865662831
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-06 15:22:04 -07:00
Brad Fitzpatrick
c48cc08de2 wgengine: stop conntrack log spam about Canonical net probes
Like we do for the ones on iOS.

As a bonus, this removes a caller of tsaddr.IsTailscaleIP which we
want to revamp/remove soonish.

Updates #13687

Change-Id: Iab576a0c48e9005c7844ab52a0aba5ba343b750e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-05 12:51:55 -07:00
Andrew Dunham
12f1bc7c77 envknob: support disk-based envknobs on the macsys build
Per my investigation just now, the $HOME environment variable is unset
on the macsys (standalone macOS GUI) variant, but the current working
directory is valid. Look for the environment variable file in that
location in addition to inside the home directory.

Updates #3707

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I481ae2e0d19b316244373e06865e3b5c3a9f3b88
2024-10-04 17:12:27 -04:00
Patrick O'Doherty
4ad3f01225 safeweb: allow passing http.Server in safeweb.Config (#13688)
Extend safeweb.Config with the ability to pass a http.Server that
safeweb will use to server traffic.

Updates corp#8207

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
2024-10-04 11:57:00 -07:00
kari-ts
8fdffb8da0 hostinfo: update SetPackage doc with new Android values (#13537)
Fixes tailscale/corp#23283

Signed-off-by: kari-ts <kari@tailscale.com>
2024-10-04 16:35:19 +00:00
Erisa A
f30d85310c cmd/tailscale/cli: don't print disablement secrets if init fails (#13673)
* cmd/tailscale/cli: don't print disablement secrets if init fails

Fixes tailscale/corp#11355

Signed-off-by: Erisa A <erisa@tailscale.com>

* cmd/tailscale/cli: changes from code review

Signed-off-by: Erisa A <erisa@tailscale.com>

* cmd/tailscale/cli: small grammar change

Signed-off-by: Erisa A <erisa@tailscale.com>

---------

Signed-off-by: Erisa A <erisa@tailscale.com>
2024-10-04 16:01:48 +01:00
Irbe Krumina
e8bb5d1be5 cmd/{k8s-operator,containerboot},k8s-operator,kube: reconcile ExternalName Services for ProxyGroup (#13635)
Adds a new reconciler that reconciles ExternalName Services that define a
tailnet target that should be exposed to cluster workloads on a ProxyGroup's
proxies.
The reconciler ensures that for each such service, the config mounted to
the proxies is updated with the tailnet target definition and that
and EndpointSlice and ClusterIP Service are created for the service.

Adds a new reconciler that ensures that as proxy Pods become ready to route
traffic to a tailnet target, the EndpointSlice for the target is updated
with the Pods' endpoints.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-04 13:11:35 +01:00
Irbe Krumina
9bd158cc09 cmd/containerboot,util/linuxfw: create a SNAT rule for dst/src only once, clean up if needed (#13658)
The AddSNATRuleForDst rule was adding a new rule each time it was called including:
- if a rule already existed
- if a rule matching the destination, but with different desired source already existed

This was causing issues especially for the in-progress egress HA proxies work,
where the rules are now refreshed more frequently, so more redundant rules
were being created.

This change:
- only creates the rule if it doesn't already exist
- if a rule for the same dst, but different source is found, delete it
- also ensures that egress proxies refresh firewall rules
if the node's tailnet IP changes

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-03 20:15:00 +01:00
Patrick O'Doherty
a3c6a3a34f safeweb: add StrictTransportSecurityOptions config (#13679)
Add the ability to specify Strict-Transport-Security options in response
to BrowserMux HTTP requests in safeweb.

Updates https://github.com/tailscale/corp/issues/23375

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
2024-10-03 18:38:29 +00:00
Brad Fitzpatrick
dc60c8d786 ssh/tailssh: pass window size pixels in IoctlSetWinsize events
Fixes #13669

Change-Id: Id44cfbb83183f1bbcbdc38c29238287b9d288707
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-03 09:24:28 -07:00
Andrea Gottardo
58c6bc2991 logpolicy: force TLS 1.3 handshake
Updates tailscale/tailscale#3363

We know `log.tailscale.io` supports TLS 1.3, so we can enforce its usage in the client to shake some bytes off the TLS handshake each time a connection is opened to upload logs.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-10-03 09:16:23 -07:00
Brad Fitzpatrick
5f88b65764 wgengine/netstack: check userspace ping success on Windows
Hacky temporary workaround until we do #13654 correctly.

Updates #13654

Change-Id: I764eaedbb112fb3a34dddb89572fec1b2543fd4a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-03 09:07:39 -07:00
Brad Fitzpatrick
1f8eea53a8 control/controlclient: include HTTP status string in error message too
Not just its code.

Updates tailscale/corp#23584

Change-Id: I8001a675372fe15da797adde22f04488d8683448
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-03 08:37:16 -07:00
Brad Fitzpatrick
6f694da912 wgengine/magicsock: avoid log spam from ReceiveFunc on shutdown
The new logging in 2dd71e64ac is spammy at shutdown:

    Receive func ReceiveIPv6 exiting with error: *net.OpError, read udp [::]:38869: raw-read udp6 [::]:38869: use of closed network connection
    Receive func ReceiveIPv4 exiting with error: *net.OpError, read udp 0.0.0.0:36123: raw-read udp4 0.0.0.0:36123: use of closed network connection

Skip it if we're in the process of shutting down.

Updates #10976

Change-Id: I4f6d1c68465557eb9ffe335d43d740e499ba9786
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-02 20:22:12 -07:00
Naman Sood
09ec2f39b5 tailcfg: add func to check for known valid ServiceProtos (#13668)
Updates tailscale/corp#23574.

Signed-off-by: Naman Sood <mail@nsood.in>
2024-10-02 22:54:02 -04:00
Brad Fitzpatrick
383120c534 ipn/ipnlocal: don't run portlist code unless service collection is on
We were selectively uploading it, but we were still gathering it,
which can be a waste of CPU.

Also remove a bunch of complexity that I don't think matters anymore.

And add an envknob to force service collection off on a single node,
even if the tailnet policy permits it.

Fixes #13463

Change-Id: Ib6abe9e29d92df4ffa955225289f045eeeb279cf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-02 18:08:31 -07:00
Nick Khyl
d837e0252f wf/firewall: allow link-local multicast for permitted local routes when the killswitch is on on Windows
When an Exit Node is used, we create a WFP rule to block all inbound and outbound traffic,
along with several rules to permit specific types of traffic. Notably, we allow all inbound and
outbound traffic to and from LocalRoutes specified in wgengine/router.Config. The list of allowed
routes always includes routes for internal interfaces, such as loopback and virtual Hyper-V/WSL2
interfaces, and may also include LAN routes if the "Allow local network access" option is enabled.
However, these permitting rules do not allow link-local multicast on the corresponding interfaces.
This results in broken mDNS/LLMNR, and potentially other similar issues, whenever an exit node is used.

In this PR, we update (*wf.Firewall).UpdatePermittedRoutes() to create rules allowing outbound and
inbound link-local multicast traffic to and from the permitted IP ranges, partially resolving the mDNS/LLMNR
and *.local name resolution issue.

Since Windows does not attempt to send mDNS/LLMNR queries if a catch-all NRPT rule is present,
it is still necessary to disable the creation of that rule using the disable-local-dns-override-via-nrpt nodeAttr.

Updates #13571

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-02 18:36:01 -05:00
Brad Fitzpatrick
b8af93310a tstest: add the start of a testing wishlist
Of tests we wish we could easily add. One day.

Updates #13038

Change-Id: If44646f8d477674bbf2c9a6e58c3cd8f94a4e8df
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-02 16:08:41 -07:00
Andrea Gottardo
6de6ab015f net/dns: tweak DoH timeout, limit MaxConnsPerHost, require TLS 1.3 (#13564)
Updates tailscale/tailscale#6148

This is the result of some observations we made today with @raggi. The DNS over HTTPS client currently doesn't cap the number of connections it uses, either in-use or idle. A burst of DNS queries will open multiple connections. Idle connections remain open for 30 seconds (this interval is defined in the dohTransportTimeout constant). For DoH providers like NextDNS which send keep-alives, this means the cellular modem will remain up more than expected to send ACKs if any keep-alives are received while a connection remains idle during those 30 seconds. We can set the IdleConnTimeout to 10 seconds to ensure an idle connection is terminated if no other DNS queries come in after 10 seconds. Additionally, we can cap the number of connections to 1. This ensures that at all times there is only one open DoH connection, either active or idle. If idle, it will be terminated within 10 seconds from the last query.

We also observed all the DoH providers we support are capable of TLS 1.3. We can force this TLS version to reduce the number of packets sent/received each time a TLS connection is established.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-10-02 09:26:11 -07:00
Brad Fitzpatrick
a01b545441 control/control{client,http}: don't noise dial localhost:443 in http-only tests
1eaad7d3de regressed some tests in another repo that were starting up
a control server on `http://127.0.0.1:nnn`. Because there was no https
running, and because of a bug in 1eaad7d3de (which ended up checking
the recently-dialed-control check twice in a single dial call), we
ended up forcing only the use of TLS dials in a test that only had
plaintext HTTP running.

Instead, plumb down support for explicitly disabling TLS fallbacks and
use it only when running in a test and using `http` scheme control
plane URLs to 127.0.0.1 or localhost.

This fixes the tests elsewhere.

Updates #13597

Change-Id: I97212ded21daf0bd510891a278078daec3eebaa6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-02 10:41:08 -05:00
Brad Fitzpatrick
6b03e18975 control/controlhttp: rename a param from addr to optAddr for clarity
And update docs.

Updates #cleanup
Updates #13597 (tangentially; noted this cleanup while debugging)

Change-Id: I62440294c78b0bb3f5673be10318dd89af1e1bfe
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-02 10:41:08 -05:00
Brad Fitzpatrick
f49d218cfe net/dnscache: don't fall back to an IPv6 dial if we don't have IPv6
I noticed while debugging a test failure elsewhere that our failure
logs (when verbosity is cranked up) were uselessly attributing dial
failures to failure to dial an invalid IP address (this IPv6 address
we didn't have), rather than showing me the actual IPv4 connection
failure.

Updates #13597 (tangentially)

Change-Id: I45ffbefbc7e25ebfb15768006413a705b941dae5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-02 10:41:08 -05:00
Brad Fitzpatrick
30f0fa95d9 control/controlclient: bound ReportHealthChange context lifetime to Direct client's
Fixes #13651

Change-Id: I8154d3cc0ca40fe7a0223b26ae2e77e8d6ba874b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-02 10:40:39 -05:00
Andrea Gottardo
ed1ac799c8 net/captivedetection: set Timeout on net.Dialer (#13613)
Updates tailscale/tailscale#1634
Updates tailscale/tailscale#13265

Captive portal detection uses a custom `net.Dialer` in its `http.Client`. This custom Dialer ensures that the socket is bound specifically to the Wi-Fi interface. This is crucial because without it, if any default routes are set, the outgoing requests for detecting a captive portal would bypass Wi-Fi and go through the default route instead.

The Dialer did not have a Timeout property configured, so the default system timeout was applied. This caused issues in #13265, where we attempted to make captive portal detection requests over an IPsec interface used for Wi-Fi Calling. The call to `connect()` would fail and remain blocked until the system timeout (approximately 1 minute) was reached.

In #13598, I simply excluded the IPsec interface from captive portal detection. This was a quick and safe mitigation for the issue. This PR is a follow-up to make the process more robust, by setting a 3 seconds timeout on any connection establishment on any interface (this is the same timeout interval we were already setting on the HTTP client).

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-10-02 15:29:46 +00:00
Nick Khyl
e66fe1f2e8 docs/windows/policy: add ADMX policy setting to configure the AuthKey
Updates tailscale/corp#22120

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-10-02 09:19:19 -05:00
dependabot[bot]
992ee6dd0b .github: Bump github/codeql-action from 3.26.8 to 3.26.9 (#13625)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.8 to 3.26.9.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](294a9d9291...461ef6c76d)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-01 23:27:30 -06:00
Brad Fitzpatrick
262c526c4e net/portmapper: don't treat 0.0.0.0 as a valid IP
Updates tailscale/corp#23538

Change-Id: I58b8c30abe43f1d1829f01eb9fb2c1e6e8db9476
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-01 16:11:47 -05:00
Andrew Dunham
16ef88754d net/portmapper: don't return unspecified/local external IPs
We were previously not checking that the external IP that we got back
from a UPnP portmap was a valid endpoint; add minimal validation that
this endpoint is something that is routeable by another host.

Updates tailscale/corp#23538

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id9649e7683394aced326d5348f4caa24d0efd532
2024-10-01 14:13:40 -04:00
Brad Fitzpatrick
1eaad7d3de control/controlhttp: fix connectivity on Alaska Air wifi
Updates #13597

Change-Id: Ifbf52b93fd35d64fcf80f8fddbfd610008fd8742
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-01 11:58:20 -05:00
Brad Fitzpatrick
fd32f0ddf4 control/controlhttp: factor out some code in prep for future change
This pulls out the clock and forceNoise443 code into methods on the
Dialer as cleanup in its own commit to make a future change less
distracting.

Updates #13597

Change-Id: I7001e57fe7b508605930c5b141a061b6fb908733
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-01 11:28:59 -05:00
Brad Fitzpatrick
d3f302d8e2 cmd/tailscale/cli: make 'tailscale debug ts2021' try twice
In prep for a future port 80 MITM fix, make the 'debug ts2021' command
retry once after a failure to give it a chance to pick a new strategy.

Updates #13597

Change-Id: Icb7bad60cbf0dbec78097df4a00e9795757bc8e4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-01 11:28:59 -05:00
Mario Minardi
8f44ba1cd6 ssh: Add logic to set accepted environment variables in SSH session (#13559)
Add logic to set environment variables that match the SSH rule's
`acceptEnv` settings in the SSH session's environment.

Updates https://github.com/tailscale/corp/issues/22775

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-30 21:47:45 -06:00
dependabot[bot]
dd6b808acf .github: Bump peter-evans/create-pull-request from 7.0.1 to 7.0.5 (#13626)
Bumps [peter-evans/create-pull-request](https://github.com/peter-evans/create-pull-request) from 7.0.1 to 7.0.5.
- [Release notes](https://github.com/peter-evans/create-pull-request/releases)
- [Commits](8867c4aba1...5e914681df)

---
updated-dependencies:
- dependency-name: peter-evans/create-pull-request
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-30 21:12:44 -06:00
Anton Tolchanov
a70287d324 logpolicy: don't create a filch buffer if logging is disabled
Updates #9549

Signed-off-by: Anton Tolchanov <commits@knyar.net>
2024-09-30 11:36:08 +02:00
Maisem Ali
fb0f8fc0ae cmd/tsidp: add --dir flag
To better control where the tsnet state is being stored.

Updates #10263

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-09-29 16:15:22 -07:00
Irbe Krumina
096b090caf cmd/containerboot,kube,util/linuxfw: configure kube egress proxies to route to 1+ tailnet targets (#13531)
* cmd/containerboot,kube,util/linuxfw: configure kube egress proxies to route to 1+ tailnet targets

This commit is first part of the work to allow running multiple
replicas of the Kubernetes operator egress proxies per tailnet service +
to allow exposing multiple tailnet services via each proxy replica.

This expands the existing iptables/nftables-based proxy configuration
mechanism.

A proxy can now be configured to route to one or more tailnet targets
via a (mounted) config file that, for each tailnet target, specifies:
- the target's tailnet IP or FQDN
- mappings of container ports to which cluster workloads will send traffic to
tailnet target ports where the traffic should be forwarded.

Example configfile contents:
{
  "some-svc": {"tailnetTarget":{"fqdn":"foo.tailnetxyz.ts.net","ports"{"tcp:4006:80":{"protocol":"tcp","matchPort":4006,"targetPort":80},"tcp:4007:443":{"protocol":"tcp","matchPort":4007,"targetPort":443}}}}
}

A proxy that is configured with this config file will configure firewall rules
to route cluster traffic to the tailnet targets. It will then watch the config file
for updates as well as monitor relevant netmap updates and reconfigure firewall
as needed.

This adds a bunch of new iptables/nftables functionality to make it easier to dynamically update
the firewall rules without needing to restart the proxy Pod as well as to make
it easier to debug/understand the rules:

- for iptables, each portmapping is a DNAT rule with a comment pointing
at the 'service',i.e:

-A PREROUTING ! -i tailscale0 -p tcp -m tcp --dport 4006 -m comment --comment "some-svc:tcp:4006 -> tcp:80" -j DNAT --to-destination 100.64.1.18:80
Additionally there is a SNAT rule for each tailnet target, to mask the source address.

- for nftables, a separate prerouting chain is created for each tailnet target
and all the portmapping rules are placed in that chain. This makes it easier
to look up rules and delete services when no longer needed.
(nftables allows hooking a custom chain to a prerouting hook, so no extra work
is needed to ensure that the rules in the service chains are evaluated).

The next steps will be to get the Kubernetes Operator to generate
the configfile and ensure it is mounted to the relevant proxy nodes.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-09-29 16:30:53 +01:00
Irbe Krumina
c62b0732d2 cmd/k8s-operator: remove auth key once proxy has logged in (#13612)
The operator creates a non-reusable auth key for each of
the cluster proxies that it creates and puts in the tailscaled
configfile mounted to the proxies.
The proxies are always tagged, and their state is persisted
in a Kubernetes Secret, so their node keys are expected to never
be regenerated, so that they don't need to re-auth.

Some tailnet configurations however have seen issues where the auth
keys being left in the tailscaled configfile cause the proxies
to end up in unauthorized state after a restart at a later point
in time.
Currently, we have not found a way to reproduce this issue,
however this commit removes the auth key from the config once
the proxy can be assumed to have logged in.

If an existing, logged-in proxy is upgraded to this version,
its redundant auth key will be removed from the conffile.

If an existing, logged-in proxy is downgraded from this version
to a previous version, it will work as before without re-issuing key
as the previous code did not enforce that a key must be present.

Updates tailscale/tailscale#13451

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-09-27 17:47:27 +01:00
Kristoffer Dalby
77832553e5 ipn/ipnlocal: add advertised and primary route metrics
Updates tailscale/corp#22075

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-27 16:05:14 +02:00
Tom Proctor
cab2e6ea67 cmd/k8s-operator,k8s-operator: add ProxyGroup CRD (#13591)
The ProxyGroup CRD specifies a set of N pods which will each be a
tailnet device, and will have M different ingress or egress services
mapped onto them. It is the mechanism for specifying how highly
available proxies need to be. This commit only adds the definition, no
controller loop, and so it is not currently functional.

This commit also splits out TailnetDevice and RecorderTailnetDevice
into separate structs because the URL field is specific to recorders,
but we want a more generic struct for use in the ProxyGroup status field.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-09-27 01:05:56 +01:00
Andrew Dunham
7ec8bdf8b1 go.mod: upgrade golangci-lint
To pull in the fix for mgechev/revive#863 - seen in the GitHub Actions
check below:
    https://github.com/tailscale/tailscale/actions/runs/11057524933/job/30721507353?pr=13600

Updates #13602

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ia04adc5d74bdbde14204645ca948794447b16776
2024-09-26 17:08:54 -04:00
Andrea Gottardo
69be54c7b6 net/captivedetection: exclude ipsec interfaces from captive portal detection (#13598)
Updates tailscale/tailscale#1634

Logs from some iOS users indicate that we're pointlessly performing captive portal detection on certain interfaces named ipsec*. These are tunnels with the cellular carrier that do not offer Internet access, and are only used to provide internet calling functionality (VoLTE / VoWiFi).

```
attempting to do captive portal detection on interface ipsec1
attempting to do captive portal detection on interface ipsec6
```

This PR excludes interfaces with the `ipsec` prefix from captive portal detection.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-09-26 17:28:10 +00:00
Kristoffer Dalby
5550a17391 wgengine: make opts.Metrics mandatory
Fixes #13582

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-26 13:09:47 +02:00
Kristoffer Dalby
7d1160ddaa {ipn,net,tsnet}: use tsaddr helpers
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-26 12:17:31 +02:00
Kristoffer Dalby
f03e82a97c client/web: use tsaddr helpers
Updates #cleanup

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-26 12:17:31 +02:00
Kristoffer Dalby
0909431660 cmd/tailscale: use tsaddr helpers
Updates #cleanup

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-26 12:17:31 +02:00
Kristoffer Dalby
3dc33a0a5b net/tsaddr: add WithoutExitRoutes and IsExitRoute
Updates #cleanup

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-26 12:17:31 +02:00
Mario Minardi
c90c9938c8 ssh/tailssh: add logic for matching against AcceptEnv patterns (#13466)
Add logic for parsing and matching against our planned format for
AcceptEnv values. Namely, this supports direct matches against string
values and matching where * and ? are treated as wildcard characters
which match against an arbitrary number of characters and a single
character respectively.

Actually using this logic in non-test code will come in subsequent
changes.

Updates https://github.com/tailscale/corp/issues/22775

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-25 21:09:05 -06:00
James Tucker
9eb59c72c1 wgengine/magicsock: fix check for EPERM on macOS
Like Linux, macOS will reply to sendto(2) with EPERM if the firewall is
currently blocking writes, though this behavior is like Linux
undocumented. This is often caused by a faulting network extension or
content filter from EDR software.

Updates #11710
Updates #12891
Updates #13511

Signed-off-by: James Tucker <james@tailscale.com>
2024-09-25 16:33:36 -07:00
Andrew Dunham
717d589149 metrics: revert changes to MultiLabelMap's String method
This breaks its ability to be used as an expvar and is blocking a trunkd
deploy. Revert for now, and add a test to ensure that we don't break it
in a future change.

Updates #13550

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1f1221c257c1de47b4bff0597c12f8530736116d
2024-09-25 19:20:50 -04:00
Cameron Stokes
65c26357b1 cmd/k8s-operator, k8s-operator: fix outdated kb links (#13585)
updates #13583

Signed-off-by: Cameron Stokes <cameron@tailscale.com>
2024-09-25 22:15:42 +01:00
Adrian Dewhurst
2fdbcbdf86 wgengine/magicsock: only used cached results for GetLastNetcheckReport
When querying for an exit node suggestion, occasionally it triggers a
new report concurrently with an existing report in progress. Generally,
there should always be a recent report or one in progress, so it is
redundant to start one there, and it causes concurrency issues.

Fixes #12643

Change-Id: I66ab9003972f673e5d4416f40eccd7c6676272a5
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
2024-09-25 16:50:33 -04:00
Brad Fitzpatrick
c2f0c705e7 health: clean up updateBuiltinWarnablesLocked a bit, fix DERP warnings
Updates #13265

Change-Id: Iabe4a062204a7859d869f6acfb9274437b4ea1ea
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-25 12:52:02 -07:00
Kristoffer Dalby
0e0e53d3b3 util/usermetrics: make usermetrics non-global
this commit changes usermetrics to be non-global, this is a building
block for correct metrics if a go process runs multiple tsnets or
in tests.

Updates #13420
Updates tailscale/corp#22075

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-25 15:57:00 +02:00
Brad Fitzpatrick
e1bbe1bf45 derp: document the RunWatchConnectionLoop callback gotchas
Updates #13566

Change-Id: I497b5adc57f8b1b97dbc3f74c0dc67140caad436
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-24 15:32:08 -07:00
Brad Fitzpatrick
6f7e7a30e3 tool/gocross: make gocross-wrapper.sh keep multiple Go toolchains around
So it doesn't delete and re-pull when switching between branches.

Updates tailscale/corp#17686

Change-Id: Iffb989781db42fcd673c5f03dbd0ce95972ede0f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-24 14:17:45 -07:00
Mario Minardi
43f4131d7a {release,version}: add DSM7.2 specific synology builds (#13405)
Add separate builds for DSM7.2 for synology so that we can encode
separate versioning information in the INFO file to distinguish between
the two.

Fixes https://github.com/tailscale/corp/issues/22908

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-24 15:00:37 -06:00
Andrea Gottardo
8a6f48b455 cli: add tailscale dns query (#13368)
Updates tailscale/tailscale#13326

Adds a CLI subcommand to perform DNS queries using the internal DNS forwarder and observe its internals (namely, which upstream resolvers are being used).

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-09-24 20:18:45 +00:00
dependabot[bot]
a98f75b783 .github: Bump tibdex/github-app-token from 1.8.0 to 2.1.0 (#9529)
Bumps [tibdex/github-app-token](https://github.com/tibdex/github-app-token) from 1.8.0 to 2.1.0.
- [Release notes](https://github.com/tibdex/github-app-token/releases)
- [Commits](b62528385c...3beb63f4bd)

---
updated-dependencies:
- dependency-name: tibdex/github-app-token
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Mario Minardi <mario@tailscale.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-24 10:01:00 -06:00
Mario Minardi
05d82fb0d8 .github: pin re-actors/alls-green to latest 1.x (#13558)
Pin re-actors/alls-green usage to latest 1.x. This was previously
pointing to `@release/v2` which pulls in the latest changes from this
branch as they are released, with the potential to break our workflows
if a breaking change or malicious version on this stream is ever pushed.

Changing this to a pinned version also means that dependabot will keep
this in the pinned version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-23 17:35:53 -06:00
Mario Minardi
04bbef0e8b .github: update and pin actions/upload-artifact to latest 4.x (#13556)
Update and pin actions/upload-artifact usage to latest 4.x. These were
previously pointing to @3 which pulls in the latest v3 as they are
released, with the potential to break our workflows if a breaking change
or malicious version on the @3 stream is ever pushed.

Changing this to a pinned version also means that dependabot will keep
this in the pinned version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-23 16:44:26 -06:00
Mario Minardi
a8bd0cb9c2 .github: update and pin actions/cache to latest 4.x (#13555)
Update and pin actions/cache usage to latest 4.x. These were previously
pointing to `@3` which pulls in the latest v3 as they are released, with
the potential to break our workflows if a breaking change or malicious
version on the `@3` stream is ever pushed.

Changing this to a pinned version also means that dependabot will keep
this in the pinned version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.

The breaking change between v3 and v4 is that v4 requires Node 20 which
should be a non-issue where this is run.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-23 16:34:55 -06:00
Mario Minardi
a3f7e72321 .github: use and pin slackapi/slack-github-action to latest 1.x (#13554)
Use slackapi/slack-github-action across the board and pin to latest 1.x.
Previously we were referencing the 1.27.0 tag directly which is
vulnerable to someone replacing that version tag with malicious code.

Replace usage of ruby/action-slack with slackapi/slack-github-action as
the latter is the officially supported action from slack.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-23 16:11:13 -06:00
Mario Minardi
22e98cf95e .github: pin codeql actions to latest 3.x (#13552)
Pin codeql actions usage to latest 3.x. These were previously pointing
to `@2` which pulls in the latest v2 as they are released, with the
potential to break our workflows if a breaking change or malicious
version on the `@2` stream is ever pushed.

Changing this to a pinned version also means that dependabot will keep
this in the pinend version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.

The breaking change between v2 and v3 is that v3 requires Node 20 which
is a non-issue as we are running this on ubuntu latest.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-23 15:52:26 -06:00
Mario Minardi
2c1bbfb902 .github: pin actions/setup-go usage to latest 5.x (#13553)
Pin actions/checkout usage to latest 5.x. These were previously pointing
to `@4` which pulls in the latest v4 as they are released, with the
potential to break our workflows if a breaking change or malicious
version on the `@4` stream is ever pushed.

Changing this to a pinned version also means that dependabot will keep
this in the pinend version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.

The breaking change between v4 and v5 is that v5 requires Node 20 which
should be a non-issue where it is used.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-23 15:14:49 -06:00
Mario Minardi
07991dec83 .github: pin actions/checkout to latest v3 or v4 as appropriate (#13551)
Pin actions/checkout usage to latest 3.x or 4.x as appropriate. These
were previously pointing to `@4` or `@3` which pull in the latest
versions at these tags as they are released, with the potential to break
our workflows if a breaking change or malicious version for either of
these streams are released.

Changing this to a pinned version also means that dependabot will keep
this in the pinend version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-23 14:52:19 -06:00
Mario Minardi
8d508712c9 tailcfg: add AcceptEnv field to SSHRule (#13523)
Add an `AcceptEnv` field to `SSHRule`. This will contain the collection
of environment variable names / patterns that are specified in the
`acceptEnv` block for the SSH rule within the policy file. This will be
used in the tailscale client to filter out unacceptable environment
variables.

Updates: https://github.com/tailscale/corp/issues/22775

Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-09-22 20:15:26 -06:00
Joe Tsai
dc86d3589c types/views: add SliceView.All iterator (#13536)
And convert a all relevant usages.

Updates #12912

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-09-20 13:55:33 -07:00
Brad Fitzpatrick
3e9ca6c64b go.toolchain.rev: bump oss, test toolchain matches go.toolchain.rev
Update go.toolchain.rev for https://github.com/tailscale/go/pull/104 and
add a test that, when using the tailscale_go build tag, we use the
right Go toolchain.

We'll crank up the strictness in later commits.

Updates #13527

Change-Id: Ifb09a844858be2beb144a420e4e9dbdc5c03ae3a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-09-19 20:27:59 -07:00
261 changed files with 14759 additions and 1844 deletions

View File

@@ -18,7 +18,7 @@ jobs:
runs-on: [ ubuntu-latest ]
steps:
- name: Check out code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Build checklocks
run: ./tool/go build -o /tmp/checklocks gvisor.dev/gvisor/tools/checklocks/cmd/checklocks

View File

@@ -45,17 +45,17 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
# Install a more recent Go that understands modern go.mod content.
- name: Install Go
uses: actions/setup-go@v4
uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
with:
go-version-file: go.mod
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v2
uses: github/codeql-action/init@6db8d6351fd0be61f9ed8ebd12ccd35dcec51fea # v3.26.11
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
@@ -66,7 +66,7 @@ jobs:
# Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@v2
uses: github/codeql-action/autobuild@6db8d6351fd0be61f9ed8ebd12ccd35dcec51fea # v3.26.11
# Command-line programs to run using the OS shell.
# 📚 https://git.io/JvXDl
@@ -80,4 +80,4 @@ jobs:
# make release
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v2
uses: github/codeql-action/analyze@6db8d6351fd0be61f9ed8ebd12ccd35dcec51fea # v3.26.11

View File

@@ -10,6 +10,6 @@ jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: "Build Docker image"
run: docker build .

View File

@@ -17,7 +17,7 @@ jobs:
id-token: "write"
contents: "read"
steps:
- uses: "actions/checkout@v4"
- uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
with:
ref: "${{ (inputs.tag != null) && format('refs/tags/{0}', inputs.tag) || '' }}"
- uses: "DeterminateSystems/nix-installer-action@main"

View File

@@ -23,9 +23,9 @@ jobs:
name: lint
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- uses: actions/setup-go@v4
- uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
with:
go-version-file: go.mod
cache: false

View File

@@ -14,7 +14,7 @@ jobs:
steps:
- name: Check out code into the Go module directory
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Install govulncheck
run: ./tool/go install golang.org/x/vuln/cmd/govulncheck@latest
@@ -24,7 +24,7 @@ jobs:
- name: Post to slack
if: failure() && github.event_name == 'schedule'
uses: slackapi/slack-github-action@v1.24.0
uses: slackapi/slack-github-action@37ebaef184d7626c5f204ab8d3baff4262dd30f0 # v1.27.0
env:
SLACK_BOT_TOKEN: ${{ secrets.GOVULNCHECK_BOT_TOKEN }}
with:

View File

@@ -98,7 +98,7 @@ jobs:
# We cannot use v4, as it requires a newer glibc version than some of the
# tested images provide. See
# https://github.com/actions/checkout/issues/1487
uses: actions/checkout@v3
uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
- name: run installer
run: scripts/installer.sh
# Package installation can fail in docker because systemd is not running

View File

@@ -17,7 +17,7 @@ jobs:
runs-on: [ ubuntu-latest ]
steps:
- name: Check out code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Build and lint Helm chart
run: |
eval `./tool/go run ./cmd/mkversion`

View File

@@ -17,7 +17,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Check out code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Run SSH integration tests
run: |
make sshintegrationtest

View File

@@ -50,7 +50,7 @@ jobs:
- shard: '4/4'
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: build test wrapper
run: ./tool/go build -o /tmp/testwrapper ./cmd/testwrapper
- name: integration tests as root
@@ -78,9 +78,9 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Restore Cache
uses: actions/cache@v3
uses: actions/cache@2cdf405574d6ef1f33a1d12acccd3ae82f47b3f2 # v4.1.0
with:
# Note: unlike the other setups, this is only grabbing the mod download
# cache, rather than the whole mod directory, as the download cache
@@ -150,16 +150,16 @@ jobs:
runs-on: windows-2022
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Install Go
uses: actions/setup-go@v4
uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
with:
go-version-file: go.mod
cache: false
- name: Restore Cache
uses: actions/cache@v3
uses: actions/cache@2cdf405574d6ef1f33a1d12acccd3ae82f47b3f2 # v4.1.0
with:
# Note: unlike the other setups, this is only grabbing the mod download
# cache, rather than the whole mod directory, as the download cache
@@ -190,7 +190,7 @@ jobs:
options: --privileged
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: chown
run: chown -R $(id -u):$(id -g) $PWD
- name: privileged tests
@@ -202,7 +202,7 @@ jobs:
if: github.repository == 'tailscale/tailscale'
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Run VM tests
run: ./tool/go test ./tstest/integration/vms -v -no-s3 -run-vm-tests -run=TestRunUbuntu2004
env:
@@ -214,7 +214,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: build all
run: ./tool/go install -race ./cmd/...
- name: build tests
@@ -258,9 +258,9 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Restore Cache
uses: actions/cache@v3
uses: actions/cache@2cdf405574d6ef1f33a1d12acccd3ae82f47b3f2 # v4.1.0
with:
# Note: unlike the other setups, this is only grabbing the mod download
# cache, rather than the whole mod directory, as the download cache
@@ -295,7 +295,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: build some
run: ./tool/go build ./ipn/... ./wgengine/ ./types/... ./control/controlclient
env:
@@ -317,9 +317,9 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Restore Cache
uses: actions/cache@v3
uses: actions/cache@2cdf405574d6ef1f33a1d12acccd3ae82f47b3f2 # v4.1.0
with:
# Note: unlike the other setups, this is only grabbing the mod download
# cache, rather than the whole mod directory, as the download cache
@@ -350,7 +350,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
# Super minimal Android build that doesn't even use CGO and doesn't build everything that's needed
# and is only arm64. But it's a smoke build: it's not meant to catch everything. But it'll catch
# some Android breakages early.
@@ -365,9 +365,9 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Restore Cache
uses: actions/cache@v3
uses: actions/cache@2cdf405574d6ef1f33a1d12acccd3ae82f47b3f2 # v4.1.0
with:
# Note: unlike the other setups, this is only grabbing the mod download
# cache, rather than the whole mod directory, as the download cache
@@ -399,7 +399,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: test tailscale_go
run: ./tool/go test -tags=tailscale_go,ts_enable_sockstats ./net/sockstats/...
@@ -456,18 +456,22 @@ jobs:
fuzz-seconds: 300
dry-run: false
language: go
- name: Set artifacts_path in env (workaround for actions/upload-artifact#176)
if: steps.run.outcome != 'success' && steps.build.outcome == 'success'
run: |
echo "artifacts_path=$(realpath .)" >> $GITHUB_ENV
- name: upload crash
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@50769540e7f4bd5e21e526ee35c689e35e0d6874 # v4.4.0
if: steps.run.outcome != 'success' && steps.build.outcome == 'success'
with:
name: artifacts
path: ./out/artifacts
path: ${{ env.artifacts_path }}/out/artifacts
depaware:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: check depaware
run: |
export PATH=$(./tool/go env GOROOT)/bin:$PATH
@@ -477,7 +481,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: check that 'go generate' is clean
run: |
pkgs=$(./tool/go list ./... | grep -Ev 'dnsfallback|k8s-operator|xdp')
@@ -490,7 +494,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: check that 'go mod tidy' is clean
run: |
./tool/go mod tidy
@@ -502,7 +506,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: check licenses
run: ./scripts/check_license_headers.sh .
@@ -518,7 +522,7 @@ jobs:
goarch: "386"
steps:
- name: checkout
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: install staticcheck
run: GOBIN=~/.local/bin ./tool/go install honnef.co/go/tools/cmd/staticcheck
- name: run staticcheck
@@ -559,7 +563,7 @@ jobs:
# By having the job always run, but skipping its only step as needed, we
# let the CI output collapse nicely in PRs.
if: failure() && github.event_name == 'push'
uses: ruby/action-slack@v3.2.1
uses: slackapi/slack-github-action@37ebaef184d7626c5f204ab8d3baff4262dd30f0 # v1.27.0
with:
payload: |
{
@@ -574,6 +578,7 @@ jobs:
}
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
check_mergeability:
if: always()
@@ -596,6 +601,6 @@ jobs:
steps:
- name: Decide if change is okay to merge
if: github.event_name != 'push'
uses: re-actors/alls-green@release/v1
uses: re-actors/alls-green@05ac9388f0aebcb5727afa17fcccfecd6f8ec5fe # v1.2.2
with:
jobs: ${{ toJSON(needs) }}

View File

@@ -21,21 +21,22 @@ jobs:
steps:
- name: Check out code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Run update-flakes
run: ./update-flake.sh
- name: Get access token
uses: tibdex/github-app-token@b62528385c34dbc9f38e5f4225ac829252d1ea92 # v1.8.0
uses: tibdex/github-app-token@3beb63f4bd073e61482598c45c71c1019b59b73a # v2.1.0
id: generate-token
with:
app_id: ${{ secrets.LICENSING_APP_ID }}
installation_id: ${{ secrets.LICENSING_APP_INSTALLATION_ID }}
installation_retrieval_mode: "id"
installation_retrieval_payload: ${{ secrets.LICENSING_APP_INSTALLATION_ID }}
private_key: ${{ secrets.LICENSING_APP_PRIVATE_KEY }}
- name: Send pull request
uses: peter-evans/create-pull-request@8867c4aba1b742c39f8d0ba35429c2dfa4b6cb20 #v7.0.1
uses: peter-evans/create-pull-request@5e914681df9dc83aa4e4905692ca88beb2f9e91f #v7.0.5
with:
token: ${{ steps.generate-token.outputs.token }}
author: Flakes Updater <noreply+flakes-updater@tailscale.com>

View File

@@ -14,7 +14,7 @@ jobs:
steps:
- name: Check out code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Run go get
run: |
@@ -23,18 +23,19 @@ jobs:
./tool/go mod tidy
- name: Get access token
uses: tibdex/github-app-token@b62528385c34dbc9f38e5f4225ac829252d1ea92 # v1.8.0
uses: tibdex/github-app-token@3beb63f4bd073e61482598c45c71c1019b59b73a # v2.1.0
id: generate-token
with:
# TODO(will): this should use the code updater app rather than licensing.
# It has the same permissions, so not a big deal, but still.
app_id: ${{ secrets.LICENSING_APP_ID }}
installation_id: ${{ secrets.LICENSING_APP_INSTALLATION_ID }}
installation_retrieval_mode: "id"
installation_retrieval_payload: ${{ secrets.LICENSING_APP_INSTALLATION_ID }}
private_key: ${{ secrets.LICENSING_APP_PRIVATE_KEY }}
- name: Send pull request
id: pull-request
uses: peter-evans/create-pull-request@8867c4aba1b742c39f8d0ba35429c2dfa4b6cb20 #v7.0.1
uses: peter-evans/create-pull-request@5e914681df9dc83aa4e4905692ca88beb2f9e91f #v7.0.5
with:
token: ${{ steps.generate-token.outputs.token }}
author: OSS Updater <noreply+oss-updater@tailscale.com>

View File

@@ -24,7 +24,7 @@ jobs:
steps:
- name: Check out code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Install deps
run: ./tool/yarn --cwd client/web
- name: Run lint

View File

@@ -1 +1 @@
1.75.0
1.77.0

View File

@@ -0,0 +1,27 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build tailscale_go
package tailscaleroot
import (
"fmt"
"os"
"strings"
)
func init() {
tsRev, ok := tailscaleToolchainRev()
if !ok {
panic("binary built with tailscale_go build tag but failed to read build info or find tailscale.toolchain.rev in build info")
}
want := strings.TrimSpace(GoToolchainRev)
if tsRev != want {
if os.Getenv("TS_PERMIT_TOOLCHAIN_MISMATCH") == "1" {
fmt.Fprintf(os.Stderr, "tailscale.toolchain.rev = %q, want %q; but ignoring due to TS_PERMIT_TOOLCHAIN_MISMATCH=1\n", tsRev, want)
return
}
panic(fmt.Sprintf("binary built with tailscale_go build tag but Go toolchain %q doesn't match github.com/tailscale/tailscale expected value %q; override this failure with TS_PERMIT_TOOLCHAIN_MISMATCH=1", tsRev, want))
}
}

View File

@@ -56,6 +56,7 @@ case "$TARGET" in
-X tailscale.com/version.gitCommitStamp=${VERSION_GIT_HASH}" \
--base="${BASE}" \
--tags="${TAGS}" \
--gotags="ts_kube,ts_package_container" \
--repos="${REPOS}" \
--push="${PUSH}" \
--target="${PLATFORM}" \
@@ -72,6 +73,7 @@ case "$TARGET" in
-X tailscale.com/version.gitCommitStamp=${VERSION_GIT_HASH}" \
--base="${BASE}" \
--tags="${TAGS}" \
--gotags="ts_kube,ts_package_container" \
--repos="${REPOS}" \
--push="${PUSH}" \
--target="${PLATFORM}" \

View File

@@ -4,7 +4,10 @@
// Package apitype contains types for the Tailscale LocalAPI and control plane API.
package apitype
import "tailscale.com/tailcfg"
import (
"tailscale.com/tailcfg"
"tailscale.com/types/dnstype"
)
// LocalAPIHost is the Host header value used by the LocalAPI.
const LocalAPIHost = "local-tailscaled.sock"
@@ -65,3 +68,11 @@ type DNSOSConfig struct {
SearchDomains []string
MatchDomains []string
}
// DNSQueryResponse is the response to a DNS query request sent via LocalAPI.
type DNSQueryResponse struct {
// Bytes is the raw DNS response bytes.
Bytes []byte
// Resolvers is the list of resolvers that the forwarder deemed able to resolve the query.
Resolvers []*dnstype.Resolver
}

View File

@@ -37,6 +37,7 @@ import (
"tailscale.com/safesocket"
"tailscale.com/tailcfg"
"tailscale.com/tka"
"tailscale.com/types/dnstype"
"tailscale.com/types/key"
"tailscale.com/types/tkatype"
)
@@ -813,6 +814,8 @@ func (lc *LocalClient) EditPrefs(ctx context.Context, mp *ipn.MaskedPrefs) (*ipn
return decodeJSON[*ipn.Prefs](body)
}
// GetDNSOSConfig returns the system DNS configuration for the current device.
// That is, it returns the DNS configuration that the system would use if Tailscale weren't being used.
func (lc *LocalClient) GetDNSOSConfig(ctx context.Context) (*apitype.DNSOSConfig, error) {
body, err := lc.get200(ctx, "/localapi/v0/dns-osconfig")
if err != nil {
@@ -825,6 +828,21 @@ func (lc *LocalClient) GetDNSOSConfig(ctx context.Context) (*apitype.DNSOSConfig
return &osCfg, nil
}
// QueryDNS executes a DNS query for a name (`google.com.`) and query type (`CNAME`).
// It returns the raw DNS response bytes and the resolvers that were used to answer the query
// (often just one, but can be more if we raced multiple resolvers).
func (lc *LocalClient) QueryDNS(ctx context.Context, name string, queryType string) (bytes []byte, resolvers []*dnstype.Resolver, err error) {
body, err := lc.get200(ctx, fmt.Sprintf("/localapi/v0/dns-query?name=%s&type=%s", url.QueryEscape(name), queryType))
if err != nil {
return nil, nil, err
}
var res apitype.DNSQueryResponse
if err := json.Unmarshal(body, &res); err != nil {
return nil, nil, fmt.Errorf("invalid query response: %w", err)
}
return res.Bytes, res.Resolvers, nil
}
// StartLoginInteractive starts an interactive login.
func (lc *LocalClient) StartLoginInteractive(ctx context.Context) error {
_, err := lc.send(ctx, "POST", "/localapi/v0/login-interactive", http.StatusNoContent, nil)

View File

@@ -51,6 +51,9 @@ type Client struct {
// HTTPClient optionally specifies an alternate HTTP client to use.
// If nil, http.DefaultClient is used.
HTTPClient *http.Client
// UserAgent optionally specifies an alternate User-Agent header
UserAgent string
}
func (c *Client) httpClient() *http.Client {
@@ -97,8 +100,9 @@ func (c *Client) setAuth(r *http.Request) {
// and can be changed manually by the user.
func NewClient(tailnet string, auth AuthMethod) *Client {
return &Client{
tailnet: tailnet,
auth: auth,
tailnet: tailnet,
auth: auth,
UserAgent: "tailscale-client-oss",
}
}
@@ -110,17 +114,16 @@ func (c *Client) Do(req *http.Request) (*http.Response, error) {
return nil, errors.New("use of Client without setting I_Acknowledge_This_API_Is_Unstable")
}
c.setAuth(req)
if c.UserAgent != "" {
req.Header.Set("User-Agent", c.UserAgent)
}
return c.httpClient().Do(req)
}
// sendRequest add the authentication key to the request and sends it. It
// receives the response and reads up to 10MB of it.
func (c *Client) sendRequest(req *http.Request) ([]byte, *http.Response, error) {
if !I_Acknowledge_This_API_Is_Unstable {
return nil, nil, errors.New("use of Client without setting I_Acknowledge_This_API_Is_Unstable")
}
c.setAuth(req)
resp, err := c.httpClient().Do(req)
resp, err := c.Do(req)
if err != nil {
return nil, resp, err
}

View File

@@ -17,7 +17,6 @@ import (
"os"
"path"
"path/filepath"
"slices"
"strings"
"sync"
"time"
@@ -27,6 +26,7 @@ import (
"tailscale.com/client/tailscale/apitype"
"tailscale.com/clientupdate"
"tailscale.com/envknob"
"tailscale.com/envknob/featureknob"
"tailscale.com/hostinfo"
"tailscale.com/ipn"
"tailscale.com/ipn/ipnstate"
@@ -35,6 +35,7 @@ import (
"tailscale.com/net/tsaddr"
"tailscale.com/tailcfg"
"tailscale.com/types/logger"
"tailscale.com/types/views"
"tailscale.com/util/httpm"
"tailscale.com/version"
"tailscale.com/version/distro"
@@ -113,11 +114,6 @@ const (
ManageServerMode ServerMode = "manage"
)
var (
exitNodeRouteV4 = netip.MustParsePrefix("0.0.0.0/0")
exitNodeRouteV6 = netip.MustParsePrefix("::/0")
)
// ServerOpts contains options for constructing a new Server.
type ServerOpts struct {
// Mode specifies the mode of web client being constructed.
@@ -927,10 +923,10 @@ func (s *Server) serveGetNodeData(w http.ResponseWriter, r *http.Request) {
return p == route
})
}
data.AdvertisingExitNodeApproved = routeApproved(exitNodeRouteV4) || routeApproved(exitNodeRouteV6)
data.AdvertisingExitNodeApproved = routeApproved(tsaddr.AllIPv4()) || routeApproved(tsaddr.AllIPv6())
for _, r := range prefs.AdvertiseRoutes {
if r == exitNodeRouteV4 || r == exitNodeRouteV6 {
if tsaddr.IsExitRoute(r) {
data.AdvertisingExitNode = true
} else {
data.AdvertisedRoutes = append(data.AdvertisedRoutes, subnetRoute{
@@ -965,37 +961,16 @@ func (s *Server) serveGetNodeData(w http.ResponseWriter, r *http.Request) {
}
func availableFeatures() map[string]bool {
env := hostinfo.GetEnvType()
features := map[string]bool{
"advertise-exit-node": true, // available on all platforms
"advertise-routes": true, // available on all platforms
"use-exit-node": canUseExitNode(env) == nil,
"ssh": envknob.CanRunTailscaleSSH() == nil,
"use-exit-node": featureknob.CanUseExitNode() == nil,
"ssh": featureknob.CanRunTailscaleSSH() == nil,
"auto-update": version.IsUnstableBuild() && clientupdate.CanAutoUpdate(),
}
if env == hostinfo.HomeAssistantAddOn {
// Setting SSH on Home Assistant causes trouble on startup
// (since the flag is not being passed to `tailscale up`).
// Although Tailscale SSH does work here,
// it's not terribly useful since it's running in a separate container.
features["ssh"] = false
}
return features
}
func canUseExitNode(env hostinfo.EnvType) error {
switch dist := distro.Get(); dist {
case distro.Synology, // see https://github.com/tailscale/tailscale/issues/1995
distro.QNAP,
distro.Unraid:
return fmt.Errorf("Tailscale exit nodes cannot be used on %s.", dist)
}
if env == hostinfo.HomeAssistantAddOn {
return errors.New("Tailscale exit nodes cannot be used on Home Assistant.")
}
return nil
}
// aclsAllowAccess returns whether tailnet ACLs (as expressed in the provided filter rules)
// permit any devices to access the local web client.
// This does not currently check whether a specific device can connect, just any device.
@@ -1071,7 +1046,7 @@ func (s *Server) servePostRoutes(ctx context.Context, data postRoutesRequest) er
var currNonExitRoutes []string
var currAdvertisingExitNode bool
for _, r := range prefs.AdvertiseRoutes {
if r == exitNodeRouteV4 || r == exitNodeRouteV6 {
if tsaddr.IsExitRoute(r) {
currAdvertisingExitNode = true
continue
}
@@ -1092,12 +1067,7 @@ func (s *Server) servePostRoutes(ctx context.Context, data postRoutesRequest) er
return err
}
hasExitNodeRoute := func(all []netip.Prefix) bool {
return slices.Contains(all, exitNodeRouteV4) ||
slices.Contains(all, exitNodeRouteV6)
}
if !data.UseExitNode.IsZero() && hasExitNodeRoute(routes) {
if !data.UseExitNode.IsZero() && tsaddr.ContainsExitRoutes(views.SliceOf(routes)) {
return errors.New("cannot use and advertise exit node at same time")
}

View File

@@ -27,11 +27,8 @@ import (
"strconv"
"strings"
"github.com/google/uuid"
"tailscale.com/clientupdate/distsign"
"tailscale.com/types/logger"
"tailscale.com/util/cmpver"
"tailscale.com/util/winutil"
"tailscale.com/version"
"tailscale.com/version/distro"
)
@@ -756,164 +753,6 @@ func (up *Updater) updateMacAppStore() error {
return nil
}
const (
// winMSIEnv is the environment variable that, if set, is the MSI file for
// the update command to install. It's passed like this so we can stop the
// tailscale.exe process from running before the msiexec process runs and
// tries to overwrite ourselves.
winMSIEnv = "TS_UPDATE_WIN_MSI"
// winExePathEnv is the environment variable that is set along with
// winMSIEnv and carries the full path of the calling tailscale.exe binary.
// It is used to re-launch the GUI process (tailscale-ipn.exe) after
// install is complete.
winExePathEnv = "TS_UPDATE_WIN_EXE_PATH"
)
var (
verifyAuthenticode func(string) error // set non-nil only on Windows
markTempFileFunc func(string) error // set non-nil only on Windows
)
func (up *Updater) updateWindows() error {
if msi := os.Getenv(winMSIEnv); msi != "" {
// stdout/stderr from this part of the install could be lost since the
// parent tailscaled is replaced. Create a temp log file to have some
// output to debug with in case update fails.
close, err := up.switchOutputToFile()
if err != nil {
up.Logf("failed to create log file for installation: %v; proceeding with existing outputs", err)
} else {
defer close.Close()
}
up.Logf("installing %v ...", msi)
if err := up.installMSI(msi); err != nil {
up.Logf("MSI install failed: %v", err)
return err
}
up.Logf("success.")
return nil
}
if !winutil.IsCurrentProcessElevated() {
return errors.New(`update must be run as Administrator
you can run the command prompt as Administrator one of these ways:
* right-click cmd.exe, select 'Run as administrator'
* press Windows+x, then press a
* press Windows+r, type in "cmd", then press Ctrl+Shift+Enter`)
}
ver, err := requestedTailscaleVersion(up.Version, up.Track)
if err != nil {
return err
}
arch := runtime.GOARCH
if arch == "386" {
arch = "x86"
}
if !up.confirm(ver) {
return nil
}
tsDir := filepath.Join(os.Getenv("ProgramData"), "Tailscale")
msiDir := filepath.Join(tsDir, "MSICache")
if fi, err := os.Stat(tsDir); err != nil {
return fmt.Errorf("expected %s to exist, got stat error: %w", tsDir, err)
} else if !fi.IsDir() {
return fmt.Errorf("expected %s to be a directory; got %v", tsDir, fi.Mode())
}
if err := os.MkdirAll(msiDir, 0700); err != nil {
return err
}
up.cleanupOldDownloads(filepath.Join(msiDir, "*.msi"))
pkgsPath := fmt.Sprintf("%s/tailscale-setup-%s-%s.msi", up.Track, ver, arch)
msiTarget := filepath.Join(msiDir, path.Base(pkgsPath))
if err := up.downloadURLToFile(pkgsPath, msiTarget); err != nil {
return err
}
up.Logf("verifying MSI authenticode...")
if err := verifyAuthenticode(msiTarget); err != nil {
return fmt.Errorf("authenticode verification of %s failed: %w", msiTarget, err)
}
up.Logf("authenticode verification succeeded")
up.Logf("making tailscale.exe copy to switch to...")
up.cleanupOldDownloads(filepath.Join(os.TempDir(), "tailscale-updater-*.exe"))
selfOrig, selfCopy, err := makeSelfCopy()
if err != nil {
return err
}
defer os.Remove(selfCopy)
up.Logf("running tailscale.exe copy for final install...")
cmd := exec.Command(selfCopy, "update")
cmd.Env = append(os.Environ(), winMSIEnv+"="+msiTarget, winExePathEnv+"="+selfOrig)
cmd.Stdout = up.Stderr
cmd.Stderr = up.Stderr
cmd.Stdin = os.Stdin
if err := cmd.Start(); err != nil {
return err
}
// Once it's started, exit ourselves, so the binary is free
// to be replaced.
os.Exit(0)
panic("unreachable")
}
func (up *Updater) switchOutputToFile() (io.Closer, error) {
var logFilePath string
exePath, err := os.Executable()
if err != nil {
logFilePath = filepath.Join(os.TempDir(), "tailscale-updater.log")
} else {
logFilePath = strings.TrimSuffix(exePath, ".exe") + ".log"
}
up.Logf("writing update output to %q", logFilePath)
logFile, err := os.Create(logFilePath)
if err != nil {
return nil, err
}
up.Logf = func(m string, args ...any) {
fmt.Fprintf(logFile, m+"\n", args...)
}
up.Stdout = logFile
up.Stderr = logFile
return logFile, nil
}
func (up *Updater) installMSI(msi string) error {
var err error
for tries := 0; tries < 2; tries++ {
cmd := exec.Command("msiexec.exe", "/i", filepath.Base(msi), "/quiet", "/norestart", "/qn")
cmd.Dir = filepath.Dir(msi)
cmd.Stdout = up.Stdout
cmd.Stderr = up.Stderr
cmd.Stdin = os.Stdin
err = cmd.Run()
if err == nil {
break
}
up.Logf("Install attempt failed: %v", err)
uninstallVersion := up.currentVersion
if v := os.Getenv("TS_DEBUG_UNINSTALL_VERSION"); v != "" {
uninstallVersion = v
}
// Assume it's a downgrade, which msiexec won't permit. Uninstall our current version first.
up.Logf("Uninstalling current version %q for downgrade...", uninstallVersion)
cmd = exec.Command("msiexec.exe", "/x", msiUUIDForVersion(uninstallVersion), "/norestart", "/qn")
cmd.Stdout = up.Stdout
cmd.Stderr = up.Stderr
cmd.Stdin = os.Stdin
err = cmd.Run()
up.Logf("msiexec uninstall: %v", err)
}
return err
}
// cleanupOldDownloads removes all files matching glob (see filepath.Glob).
// Only regular files are removed, so the glob must match specific files and
// not directories.
@@ -938,53 +777,6 @@ func (up *Updater) cleanupOldDownloads(glob string) {
}
}
func msiUUIDForVersion(ver string) string {
arch := runtime.GOARCH
if arch == "386" {
arch = "x86"
}
track, err := versionToTrack(ver)
if err != nil {
track = UnstableTrack
}
msiURL := fmt.Sprintf("https://pkgs.tailscale.com/%s/tailscale-setup-%s-%s.msi", track, ver, arch)
return "{" + strings.ToUpper(uuid.NewSHA1(uuid.NameSpaceURL, []byte(msiURL)).String()) + "}"
}
func makeSelfCopy() (origPathExe, tmpPathExe string, err error) {
selfExe, err := os.Executable()
if err != nil {
return "", "", err
}
f, err := os.Open(selfExe)
if err != nil {
return "", "", err
}
defer f.Close()
f2, err := os.CreateTemp("", "tailscale-updater-*.exe")
if err != nil {
return "", "", err
}
if f := markTempFileFunc; f != nil {
if err := f(f2.Name()); err != nil {
return "", "", err
}
}
if _, err := io.Copy(f2, f); err != nil {
f2.Close()
return "", "", err
}
return selfExe, f2.Name(), f2.Close()
}
func (up *Updater) downloadURLToFile(pathSrc, fileDst string) (ret error) {
c, err := distsign.NewClient(up.Logf, up.PkgsAddr)
if err != nil {
return err
}
return c.Download(context.Background(), pathSrc, fileDst)
}
func (up *Updater) updateFreeBSD() (err error) {
if up.Version != "" {
return errors.New("installing a specific version on FreeBSD is not supported")

View File

@@ -0,0 +1,20 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build (linux && !android) || windows
package clientupdate
import (
"context"
"tailscale.com/clientupdate/distsign"
)
func (up *Updater) downloadURLToFile(pathSrc, fileDst string) (ret error) {
c, err := distsign.NewClient(up.Logf, up.PkgsAddr)
if err != nil {
return err
}
return c.Download(context.Background(), pathSrc, fileDst)
}

View File

@@ -0,0 +1,10 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !((linux && !android) || windows)
package clientupdate
func (up *Updater) downloadURLToFile(pathSrc, fileDst string) (ret error) {
panic("unreachable")
}

View File

@@ -0,0 +1,10 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !windows
package clientupdate
func (up *Updater) updateWindows() error {
panic("unreachable")
}

View File

@@ -7,13 +7,57 @@
package clientupdate
import (
"errors"
"fmt"
"io"
"os"
"os/exec"
"path"
"path/filepath"
"runtime"
"strings"
"github.com/google/uuid"
"golang.org/x/sys/windows"
"tailscale.com/util/winutil"
"tailscale.com/util/winutil/authenticode"
)
func init() {
markTempFileFunc = markTempFileWindows
verifyAuthenticode = verifyTailscale
const (
// winMSIEnv is the environment variable that, if set, is the MSI file for
// the update command to install. It's passed like this so we can stop the
// tailscale.exe process from running before the msiexec process runs and
// tries to overwrite ourselves.
winMSIEnv = "TS_UPDATE_WIN_MSI"
// winExePathEnv is the environment variable that is set along with
// winMSIEnv and carries the full path of the calling tailscale.exe binary.
// It is used to re-launch the GUI process (tailscale-ipn.exe) after
// install is complete.
winExePathEnv = "TS_UPDATE_WIN_EXE_PATH"
)
func makeSelfCopy() (origPathExe, tmpPathExe string, err error) {
selfExe, err := os.Executable()
if err != nil {
return "", "", err
}
f, err := os.Open(selfExe)
if err != nil {
return "", "", err
}
defer f.Close()
f2, err := os.CreateTemp("", "tailscale-updater-*.exe")
if err != nil {
return "", "", err
}
if err := markTempFileWindows(f2.Name()); err != nil {
return "", "", err
}
if _, err := io.Copy(f2, f); err != nil {
f2.Close()
return "", "", err
}
return selfExe, f2.Name(), f2.Close()
}
func markTempFileWindows(name string) error {
@@ -23,6 +67,159 @@ func markTempFileWindows(name string) error {
const certSubjectTailscale = "Tailscale Inc."
func verifyTailscale(path string) error {
func verifyAuthenticode(path string) error {
return authenticode.Verify(path, certSubjectTailscale)
}
func (up *Updater) updateWindows() error {
if msi := os.Getenv(winMSIEnv); msi != "" {
// stdout/stderr from this part of the install could be lost since the
// parent tailscaled is replaced. Create a temp log file to have some
// output to debug with in case update fails.
close, err := up.switchOutputToFile()
if err != nil {
up.Logf("failed to create log file for installation: %v; proceeding with existing outputs", err)
} else {
defer close.Close()
}
up.Logf("installing %v ...", msi)
if err := up.installMSI(msi); err != nil {
up.Logf("MSI install failed: %v", err)
return err
}
up.Logf("success.")
return nil
}
if !winutil.IsCurrentProcessElevated() {
return errors.New(`update must be run as Administrator
you can run the command prompt as Administrator one of these ways:
* right-click cmd.exe, select 'Run as administrator'
* press Windows+x, then press a
* press Windows+r, type in "cmd", then press Ctrl+Shift+Enter`)
}
ver, err := requestedTailscaleVersion(up.Version, up.Track)
if err != nil {
return err
}
arch := runtime.GOARCH
if arch == "386" {
arch = "x86"
}
if !up.confirm(ver) {
return nil
}
tsDir := filepath.Join(os.Getenv("ProgramData"), "Tailscale")
msiDir := filepath.Join(tsDir, "MSICache")
if fi, err := os.Stat(tsDir); err != nil {
return fmt.Errorf("expected %s to exist, got stat error: %w", tsDir, err)
} else if !fi.IsDir() {
return fmt.Errorf("expected %s to be a directory; got %v", tsDir, fi.Mode())
}
if err := os.MkdirAll(msiDir, 0700); err != nil {
return err
}
up.cleanupOldDownloads(filepath.Join(msiDir, "*.msi"))
pkgsPath := fmt.Sprintf("%s/tailscale-setup-%s-%s.msi", up.Track, ver, arch)
msiTarget := filepath.Join(msiDir, path.Base(pkgsPath))
if err := up.downloadURLToFile(pkgsPath, msiTarget); err != nil {
return err
}
up.Logf("verifying MSI authenticode...")
if err := verifyAuthenticode(msiTarget); err != nil {
return fmt.Errorf("authenticode verification of %s failed: %w", msiTarget, err)
}
up.Logf("authenticode verification succeeded")
up.Logf("making tailscale.exe copy to switch to...")
up.cleanupOldDownloads(filepath.Join(os.TempDir(), "tailscale-updater-*.exe"))
selfOrig, selfCopy, err := makeSelfCopy()
if err != nil {
return err
}
defer os.Remove(selfCopy)
up.Logf("running tailscale.exe copy for final install...")
cmd := exec.Command(selfCopy, "update")
cmd.Env = append(os.Environ(), winMSIEnv+"="+msiTarget, winExePathEnv+"="+selfOrig)
cmd.Stdout = up.Stderr
cmd.Stderr = up.Stderr
cmd.Stdin = os.Stdin
if err := cmd.Start(); err != nil {
return err
}
// Once it's started, exit ourselves, so the binary is free
// to be replaced.
os.Exit(0)
panic("unreachable")
}
func (up *Updater) installMSI(msi string) error {
var err error
for tries := 0; tries < 2; tries++ {
cmd := exec.Command("msiexec.exe", "/i", filepath.Base(msi), "/quiet", "/norestart", "/qn")
cmd.Dir = filepath.Dir(msi)
cmd.Stdout = up.Stdout
cmd.Stderr = up.Stderr
cmd.Stdin = os.Stdin
err = cmd.Run()
if err == nil {
break
}
up.Logf("Install attempt failed: %v", err)
uninstallVersion := up.currentVersion
if v := os.Getenv("TS_DEBUG_UNINSTALL_VERSION"); v != "" {
uninstallVersion = v
}
// Assume it's a downgrade, which msiexec won't permit. Uninstall our current version first.
up.Logf("Uninstalling current version %q for downgrade...", uninstallVersion)
cmd = exec.Command("msiexec.exe", "/x", msiUUIDForVersion(uninstallVersion), "/norestart", "/qn")
cmd.Stdout = up.Stdout
cmd.Stderr = up.Stderr
cmd.Stdin = os.Stdin
err = cmd.Run()
up.Logf("msiexec uninstall: %v", err)
}
return err
}
func msiUUIDForVersion(ver string) string {
arch := runtime.GOARCH
if arch == "386" {
arch = "x86"
}
track, err := versionToTrack(ver)
if err != nil {
track = UnstableTrack
}
msiURL := fmt.Sprintf("https://pkgs.tailscale.com/%s/tailscale-setup-%s-%s.msi", track, ver, arch)
return "{" + strings.ToUpper(uuid.NewSHA1(uuid.NameSpaceURL, []byte(msiURL)).String()) + "}"
}
func (up *Updater) switchOutputToFile() (io.Closer, error) {
var logFilePath string
exePath, err := os.Executable()
if err != nil {
logFilePath = filepath.Join(os.TempDir(), "tailscale-updater.log")
} else {
logFilePath = strings.TrimSuffix(exePath, ".exe") + ".log"
}
up.Logf("writing update output to %q", logFilePath)
logFile, err := os.Create(logFilePath)
if err != nil {
return nil, err
}
up.Logf = func(m string, args ...any) {
fmt.Fprintf(logFile, m+"\n", args...)
}
up.Stdout = logFile
up.Stderr = logFile
return logFile, nil
}

View File

@@ -117,7 +117,7 @@ func installEgressForwardingRule(_ context.Context, dstStr string, tsIPs []netip
if err := nfr.DNATNonTailscaleTraffic("tailscale0", dst); err != nil {
return fmt.Errorf("installing egress proxy rules: %w", err)
}
if err := nfr.AddSNATRuleForDst(local, dst); err != nil {
if err := nfr.EnsureSNATForDst(local, dst); err != nil {
return fmt.Errorf("installing egress proxy rules: %w", err)
}
if err := nfr.ClampMSSToPMTU("tailscale0", dst); err != nil {

View File

@@ -132,35 +132,9 @@ func newNetfilterRunner(logf logger.Logf) (linuxfw.NetfilterRunner, error) {
func main() {
log.SetPrefix("boot: ")
tailscale.I_Acknowledge_This_API_Is_Unstable = true
cfg := &settings{
AuthKey: defaultEnvs([]string{"TS_AUTHKEY", "TS_AUTH_KEY"}, ""),
Hostname: defaultEnv("TS_HOSTNAME", ""),
Routes: defaultEnvStringPointer("TS_ROUTES"),
ServeConfigPath: defaultEnv("TS_SERVE_CONFIG", ""),
ProxyTargetIP: defaultEnv("TS_DEST_IP", ""),
ProxyTargetDNSName: defaultEnv("TS_EXPERIMENTAL_DEST_DNS_NAME", ""),
TailnetTargetIP: defaultEnv("TS_TAILNET_TARGET_IP", ""),
TailnetTargetFQDN: defaultEnv("TS_TAILNET_TARGET_FQDN", ""),
DaemonExtraArgs: defaultEnv("TS_TAILSCALED_EXTRA_ARGS", ""),
ExtraArgs: defaultEnv("TS_EXTRA_ARGS", ""),
InKubernetes: os.Getenv("KUBERNETES_SERVICE_HOST") != "",
UserspaceMode: defaultBool("TS_USERSPACE", true),
StateDir: defaultEnv("TS_STATE_DIR", ""),
AcceptDNS: defaultEnvBoolPointer("TS_ACCEPT_DNS"),
KubeSecret: defaultEnv("TS_KUBE_SECRET", "tailscale"),
SOCKSProxyAddr: defaultEnv("TS_SOCKS5_SERVER", ""),
HTTPProxyAddr: defaultEnv("TS_OUTBOUND_HTTP_PROXY_LISTEN", ""),
Socket: defaultEnv("TS_SOCKET", "/tmp/tailscaled.sock"),
AuthOnce: defaultBool("TS_AUTH_ONCE", false),
Root: defaultEnv("TS_TEST_ONLY_ROOT", "/"),
TailscaledConfigFilePath: tailscaledConfigFilePath(),
AllowProxyingClusterTrafficViaIngress: defaultBool("EXPERIMENTAL_ALLOW_PROXYING_CLUSTER_TRAFFIC_VIA_INGRESS", false),
PodIP: defaultEnv("POD_IP", ""),
EnableForwardingOptimizations: defaultBool("TS_EXPERIMENTAL_ENABLE_FORWARDING_OPTIMIZATIONS", false),
HealthCheckAddrPort: defaultEnv("TS_HEALTHCHECK_ADDR_PORT", ""),
}
if err := cfg.validate(); err != nil {
cfg, err := configFromEnv()
if err != nil {
log.Fatalf("invalid configuration: %v", err)
}
@@ -275,10 +249,8 @@ authLoop:
switch *n.State {
case ipn.NeedsLogin:
if isOneStepConfig(cfg) {
// This could happen if this is the
// first time tailscaled was run for
// this device and the auth key was not
// passed via the configfile.
// This could happen if this is the first time tailscaled was run for this
// device and the auth key was not passed via the configfile.
log.Fatalf("invalid state: tailscaled daemon started with a config file, but tailscale is not logged in: ensure you pass a valid auth key in the config file.")
}
if err := authTailscale(); err != nil {
@@ -376,6 +348,9 @@ authLoop:
}
})
)
// egressSvcsErrorChan will get an error sent to it if this containerboot instance is configured to expose 1+
// egress services in HA mode and errored.
var egressSvcsErrorChan = make(chan error)
defer t.Stop()
// resetTimer resets timer for when to next attempt to resolve the DNS
// name for the proxy configured with TS_EXPERIMENTAL_DEST_DNS_NAME. The
@@ -401,6 +376,7 @@ authLoop:
failedResolveAttempts++
}
var egressSvcsNotify chan ipn.Notify
notifyChan := make(chan ipn.Notify)
errChan := make(chan error)
go func() {
@@ -478,7 +454,11 @@ runLoop:
egressAddrs = node.Addresses().AsSlice()
newCurentEgressIPs = deephash.Hash(&egressAddrs)
egressIPsHaveChanged = newCurentEgressIPs != currentEgressIPs
if egressIPsHaveChanged && len(egressAddrs) != 0 {
// The firewall rules get (re-)installed:
// - on startup
// - when the tailnet IPs of the tailnet target have changed
// - when the tailnet IPs of this node have changed
if (egressIPsHaveChanged || ipsHaveChanged) && len(egressAddrs) != 0 {
var rulesInstalled bool
for _, egressAddr := range egressAddrs {
ea := egressAddr.Addr()
@@ -575,31 +555,50 @@ runLoop:
h.Unlock()
healthzRunner()
}
if egressSvcsNotify != nil {
egressSvcsNotify <- n
}
}
if !startupTasksDone {
// For containerboot instances that act as TCP
// proxies (proxying traffic to an endpoint
// passed via one of the env vars that
// containerbot reads) and store state in a
// Kubernetes Secret, we consider startup tasks
// done at the point when device info has been
// successfully stored to state Secret.
// For all other containerboot instances, if we
// just get to this point the startup tasks can
// be considered done.
// For containerboot instances that act as TCP proxies (proxying traffic to an endpoint
// passed via one of the env vars that containerboot reads) and store state in a
// Kubernetes Secret, we consider startup tasks done at the point when device info has
// been successfully stored to state Secret. For all other containerboot instances, if
// we just get to this point the startup tasks can be considered done.
if !isL3Proxy(cfg) || !hasKubeStateStore(cfg) || (currentDeviceEndpoints != deephash.Sum{} && currentDeviceID != deephash.Sum{}) {
// This log message is used in tests to detect when all
// post-auth configuration is done.
log.Println("Startup complete, waiting for shutdown signal")
startupTasksDone = true
// Wait on tailscaled process. It won't
// be cleaned up by default when the
// container exits as it is not PID1.
// TODO (irbekrm): perhaps we can
// replace the reaper by a running
// cmd.Wait in a goroutine immediately
// after starting tailscaled?
// Configure egress proxy. Egress proxy will set up firewall rules to proxy
// traffic to tailnet targets configured in the provided configuration file. It
// will then continuously monitor the config file and netmap updates and
// reconfigure the firewall rules as needed. If any of its operations fail, it
// will crash this node.
if cfg.EgressSvcsCfgPath != "" {
log.Printf("configuring egress proxy using configuration file at %s", cfg.EgressSvcsCfgPath)
egressSvcsNotify = make(chan ipn.Notify)
ep := egressProxy{
cfgPath: cfg.EgressSvcsCfgPath,
nfr: nfr,
kc: kc,
stateSecret: cfg.KubeSecret,
netmapChan: egressSvcsNotify,
podIPv4: cfg.PodIPv4,
tailnetAddrs: addrs,
}
go func() {
if err := ep.run(ctx, n); err != nil {
egressSvcsErrorChan <- err
}
}()
}
// Wait on tailscaled process. It won't be cleaned up by default when the
// container exits as it is not PID1. TODO (irbekrm): perhaps we can replace the
// reaper by a running cmd.Wait in a goroutine immediately after starting
// tailscaled?
reaper := func() {
defer wg.Done()
for {
@@ -637,6 +636,8 @@ runLoop:
}
backendAddrs = newBackendAddrs
resetTimer(false)
case e := <-egressSvcsErrorChan:
log.Fatalf("egress proxy failed: %v", e)
}
}
wg.Wait()
@@ -741,5 +742,5 @@ func tailscaledConfigFilePath() string {
log.Fatalf("no tailscaled config file found in %q for current capability version %q", dir, tailcfg.CurrentCapabilityVersion)
}
log.Printf("Using tailscaled config file %q for capability version %q", maxCompatVer, tailcfg.CurrentCapabilityVersion)
return path.Join(dir, kubeutils.TailscaledConfigFileNameForCap(maxCompatVer))
return path.Join(dir, kubeutils.TailscaledConfigFileName(maxCompatVer))
}

View File

@@ -0,0 +1,571 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build linux
package main
import (
"context"
"encoding/json"
"errors"
"fmt"
"log"
"net/netip"
"os"
"path/filepath"
"reflect"
"strings"
"time"
"github.com/fsnotify/fsnotify"
"tailscale.com/ipn"
"tailscale.com/kube/egressservices"
"tailscale.com/kube/kubeclient"
"tailscale.com/tailcfg"
"tailscale.com/util/linuxfw"
"tailscale.com/util/mak"
)
const tailscaleTunInterface = "tailscale0"
// This file contains functionality to run containerboot as a proxy that can
// route cluster traffic to one or more tailnet targets, based on portmapping
// rules read from a configfile. Currently (9/2024) this is only used for the
// Kubernetes operator egress proxies.
// egressProxy knows how to configure firewall rules to route cluster traffic to
// one or more tailnet services.
type egressProxy struct {
cfgPath string // path to egress service config file
nfr linuxfw.NetfilterRunner // never nil
kc kubeclient.Client // never nil
stateSecret string // name of the kube state Secret
netmapChan chan ipn.Notify // chan to receive netmap updates on
podIPv4 string // never empty string, currently only IPv4 is supported
// tailnetFQDNs is the egress service FQDN to tailnet IP mappings that
// were last used to configure firewall rules for this proxy.
// TODO(irbekrm): target addresses are also stored in the state Secret.
// Evaluate whether we should retrieve them from there and not store in
// memory at all.
targetFQDNs map[string][]netip.Prefix
// used to configure firewall rules.
tailnetAddrs []netip.Prefix
}
// run configures egress proxy firewall rules and ensures that the firewall rules are reconfigured when:
// - the mounted egress config has changed
// - the proxy's tailnet IP addresses have changed
// - tailnet IPs have changed for any backend targets specified by tailnet FQDN
func (ep *egressProxy) run(ctx context.Context, n ipn.Notify) error {
var tickChan <-chan time.Time
var eventChan <-chan fsnotify.Event
// TODO (irbekrm): take a look if this can be pulled into a single func
// shared with serve config loader.
if w, err := fsnotify.NewWatcher(); err != nil {
log.Printf("failed to create fsnotify watcher, timer-only mode: %v", err)
ticker := time.NewTicker(5 * time.Second)
defer ticker.Stop()
tickChan = ticker.C
} else {
defer w.Close()
if err := w.Add(filepath.Dir(ep.cfgPath)); err != nil {
return fmt.Errorf("failed to add fsnotify watch: %w", err)
}
eventChan = w.Events
}
if err := ep.sync(ctx, n); err != nil {
return err
}
for {
var err error
select {
case <-ctx.Done():
return nil
case <-tickChan:
err = ep.sync(ctx, n)
case <-eventChan:
log.Printf("config file change detected, ensuring firewall config is up to date...")
err = ep.sync(ctx, n)
case n = <-ep.netmapChan:
shouldResync := ep.shouldResync(n)
if shouldResync {
log.Printf("netmap change detected, ensuring firewall config is up to date...")
err = ep.sync(ctx, n)
}
}
if err != nil {
return fmt.Errorf("error syncing egress service config: %w", err)
}
}
}
// sync triggers an egress proxy config resync. The resync calculates the diff between config and status to determine if
// any firewall rules need to be updated. Currently using status in state Secret as a reference for what is the current
// firewall configuration is good enough because - the status is keyed by the Pod IP - we crash the Pod on errors such
// as failed firewall update
func (ep *egressProxy) sync(ctx context.Context, n ipn.Notify) error {
cfgs, err := ep.getConfigs()
if err != nil {
return fmt.Errorf("error retrieving egress service configs: %w", err)
}
status, err := ep.getStatus(ctx)
if err != nil {
return fmt.Errorf("error retrieving current egress proxy status: %w", err)
}
newStatus, err := ep.syncEgressConfigs(cfgs, status, n)
if err != nil {
return fmt.Errorf("error syncing egress service configs: %w", err)
}
if !servicesStatusIsEqual(newStatus, status) {
if err := ep.setStatus(ctx, newStatus, n); err != nil {
return fmt.Errorf("error setting egress proxy status: %w", err)
}
}
return nil
}
// addrsHaveChanged returns true if the provided netmap update contains tailnet address change for this proxy node.
// Netmap must not be nil.
func (ep *egressProxy) addrsHaveChanged(n ipn.Notify) bool {
return !reflect.DeepEqual(ep.tailnetAddrs, n.NetMap.SelfNode.Addresses())
}
// syncEgressConfigs adds and deletes firewall rules to match the desired
// configuration. It uses the provided status to determine what is currently
// applied and updates the status after a successful sync.
func (ep *egressProxy) syncEgressConfigs(cfgs *egressservices.Configs, status *egressservices.Status, n ipn.Notify) (*egressservices.Status, error) {
if !(wantsServicesConfigured(cfgs) || hasServicesConfigured(status)) {
return nil, nil
}
// Delete unnecessary services.
if err := ep.deleteUnnecessaryServices(cfgs, status); err != nil {
return nil, fmt.Errorf("error deleting services: %w", err)
}
newStatus := &egressservices.Status{}
if !wantsServicesConfigured(cfgs) {
return newStatus, nil
}
// Add new services, update rules for any that have changed.
rulesPerSvcToAdd := make(map[string][]rule, 0)
rulesPerSvcToDelete := make(map[string][]rule, 0)
for svcName, cfg := range *cfgs {
tailnetTargetIPs, err := ep.tailnetTargetIPsForSvc(cfg, n)
if err != nil {
return nil, fmt.Errorf("error determining tailnet target IPs: %w", err)
}
rulesToAdd, rulesToDelete, err := updatesForCfg(svcName, cfg, status, tailnetTargetIPs)
if err != nil {
return nil, fmt.Errorf("error validating service changes: %v", err)
}
log.Printf("syncegressservices: looking at svc %s rulesToAdd %d rulesToDelete %d", svcName, len(rulesToAdd), len(rulesToDelete))
if len(rulesToAdd) != 0 {
mak.Set(&rulesPerSvcToAdd, svcName, rulesToAdd)
}
if len(rulesToDelete) != 0 {
mak.Set(&rulesPerSvcToDelete, svcName, rulesToDelete)
}
if len(rulesToAdd) != 0 || ep.addrsHaveChanged(n) {
// For each tailnet target, set up SNAT from the local tailnet device address of the matching
// family.
for _, t := range tailnetTargetIPs {
var local netip.Addr
for _, pfx := range n.NetMap.SelfNode.Addresses().All() {
if !pfx.IsSingleIP() {
continue
}
if pfx.Addr().Is4() != t.Is4() {
continue
}
local = pfx.Addr()
break
}
if !local.IsValid() {
return nil, fmt.Errorf("no valid local IP: %v", local)
}
if err := ep.nfr.EnsureSNATForDst(local, t); err != nil {
return nil, fmt.Errorf("error setting up SNAT rule: %w", err)
}
}
}
// Update the status. Status will be written back to the state Secret by the caller.
mak.Set(&newStatus.Services, svcName, &egressservices.ServiceStatus{TailnetTargetIPs: tailnetTargetIPs, TailnetTarget: cfg.TailnetTarget, Ports: cfg.Ports})
}
// Actually apply the firewall rules.
if err := ensureRulesAdded(rulesPerSvcToAdd, ep.nfr); err != nil {
return nil, fmt.Errorf("error adding rules: %w", err)
}
if err := ensureRulesDeleted(rulesPerSvcToDelete, ep.nfr); err != nil {
return nil, fmt.Errorf("error deleting rules: %w", err)
}
return newStatus, nil
}
// updatesForCfg calculates any rules that need to be added or deleted for an individucal egress service config.
func updatesForCfg(svcName string, cfg egressservices.Config, status *egressservices.Status, tailnetTargetIPs []netip.Addr) ([]rule, []rule, error) {
rulesToAdd := make([]rule, 0)
rulesToDelete := make([]rule, 0)
currentConfig, ok := lookupCurrentConfig(svcName, status)
// If no rules for service are present yet, add them all.
if !ok {
for _, t := range tailnetTargetIPs {
for ports := range cfg.Ports {
log.Printf("syncegressservices: svc %s adding port %v", svcName, ports)
rulesToAdd = append(rulesToAdd, rule{tailnetPort: ports.TargetPort, containerPort: ports.MatchPort, protocol: ports.Protocol, tailnetIP: t})
}
}
return rulesToAdd, rulesToDelete, nil
}
// If there are no backend targets available, delete any currently configured rules.
if len(tailnetTargetIPs) == 0 {
log.Printf("tailnet target for egress service %s does not have any backend addresses, deleting all rules", svcName)
for _, ip := range currentConfig.TailnetTargetIPs {
for ports := range currentConfig.Ports {
rulesToDelete = append(rulesToAdd, rule{tailnetPort: ports.TargetPort, containerPort: ports.MatchPort, protocol: ports.Protocol, tailnetIP: ip})
}
}
return rulesToAdd, rulesToDelete, nil
}
// If there are rules present for backend targets that no longer match, delete them.
for _, ip := range currentConfig.TailnetTargetIPs {
var found bool
for _, wantsIP := range tailnetTargetIPs {
if reflect.DeepEqual(ip, wantsIP) {
found = true
break
}
}
if !found {
for ports := range currentConfig.Ports {
rulesToDelete = append(rulesToDelete, rule{tailnetPort: ports.TargetPort, containerPort: ports.MatchPort, protocol: ports.Protocol, tailnetIP: ip})
}
}
}
// Sync rules for the currently wanted backend targets.
for _, ip := range tailnetTargetIPs {
// If the backend target is not yet present in status, add all rules.
var found bool
for _, gotIP := range currentConfig.TailnetTargetIPs {
if reflect.DeepEqual(ip, gotIP) {
found = true
break
}
}
if !found {
for ports := range cfg.Ports {
rulesToAdd = append(rulesToAdd, rule{tailnetPort: ports.TargetPort, containerPort: ports.MatchPort, protocol: ports.Protocol, tailnetIP: ip})
}
continue
}
// If the backend target is present in status, check that the
// currently applied rules are up to date.
// Delete any current portmappings that are no longer present in config.
for port := range currentConfig.Ports {
if _, ok := cfg.Ports[port]; ok {
continue
}
rulesToDelete = append(rulesToDelete, rule{tailnetPort: port.TargetPort, containerPort: port.MatchPort, protocol: port.Protocol, tailnetIP: ip})
}
// Add any new portmappings.
for port := range cfg.Ports {
if _, ok := currentConfig.Ports[port]; ok {
continue
}
rulesToAdd = append(rulesToAdd, rule{tailnetPort: port.TargetPort, containerPort: port.MatchPort, protocol: port.Protocol, tailnetIP: ip})
}
}
return rulesToAdd, rulesToDelete, nil
}
// deleteUnneccessaryServices ensure that any services found on status, but not
// present in config are deleted.
func (ep *egressProxy) deleteUnnecessaryServices(cfgs *egressservices.Configs, status *egressservices.Status) error {
if !hasServicesConfigured(status) {
return nil
}
if !wantsServicesConfigured(cfgs) {
for svcName, svc := range status.Services {
log.Printf("service %s is no longer required, deleting", svcName)
if err := ensureServiceDeleted(svcName, svc, ep.nfr); err != nil {
return fmt.Errorf("error deleting service %s: %w", svcName, err)
}
}
return nil
}
for svcName, svc := range status.Services {
if _, ok := (*cfgs)[svcName]; !ok {
log.Printf("service %s is no longer required, deleting", svcName)
if err := ensureServiceDeleted(svcName, svc, ep.nfr); err != nil {
return fmt.Errorf("error deleting service %s: %w", svcName, err)
}
// TODO (irbekrm): also delete the SNAT rule here
}
}
return nil
}
// getConfigs gets the mounted egress service configuration.
func (ep *egressProxy) getConfigs() (*egressservices.Configs, error) {
j, err := os.ReadFile(ep.cfgPath)
if os.IsNotExist(err) {
return nil, nil
}
if err != nil {
return nil, err
}
if len(j) == 0 || string(j) == "" {
return nil, nil
}
cfg := &egressservices.Configs{}
if err := json.Unmarshal(j, &cfg); err != nil {
return nil, err
}
return cfg, nil
}
// getStatus gets the current status of the configured firewall. The current
// status is stored in state Secret. Returns nil status if no status that
// applies to the current proxy Pod was found. Uses the Pod IP to determine if a
// status found in the state Secret applies to this proxy Pod.
func (ep *egressProxy) getStatus(ctx context.Context) (*egressservices.Status, error) {
secret, err := ep.kc.GetSecret(ctx, ep.stateSecret)
if err != nil {
return nil, fmt.Errorf("error retrieving state secret: %w", err)
}
status := &egressservices.Status{}
raw, ok := secret.Data[egressservices.KeyEgressServices]
if !ok {
return nil, nil
}
if err := json.Unmarshal([]byte(raw), status); err != nil {
return nil, fmt.Errorf("error unmarshalling previous config: %w", err)
}
if reflect.DeepEqual(status.PodIPv4, ep.podIPv4) {
return status, nil
}
return nil, nil
}
// setStatus writes egress proxy's currently configured firewall to the state
// Secret and updates proxy's tailnet addresses.
func (ep *egressProxy) setStatus(ctx context.Context, status *egressservices.Status, n ipn.Notify) error {
// Pod IP is used to determine if a stored status applies to THIS proxy Pod.
if status == nil {
status = &egressservices.Status{}
}
status.PodIPv4 = ep.podIPv4
secret, err := ep.kc.GetSecret(ctx, ep.stateSecret)
if err != nil {
return fmt.Errorf("error retrieving state Secret: %w", err)
}
bs, err := json.Marshal(status)
if err != nil {
return fmt.Errorf("error marshalling service config: %w", err)
}
secret.Data[egressservices.KeyEgressServices] = bs
patch := kubeclient.JSONPatch{
Op: "replace",
Path: fmt.Sprintf("/data/%s", egressservices.KeyEgressServices),
Value: bs,
}
if err := ep.kc.JSONPatchSecret(ctx, ep.stateSecret, []kubeclient.JSONPatch{patch}); err != nil {
return fmt.Errorf("error patching state Secret: %w", err)
}
ep.tailnetAddrs = n.NetMap.SelfNode.Addresses().AsSlice()
return nil
}
// tailnetTargetIPsForSvc returns the tailnet IPs to which traffic for this
// egress service should be proxied. The egress service can be configured by IP
// or by FQDN. If it's configured by IP, just return that. If it's configured by
// FQDN, resolve the FQDN and return the resolved IPs. It checks if the
// netfilter runner supports IPv6 NAT and skips any IPv6 addresses if it
// doesn't.
func (ep *egressProxy) tailnetTargetIPsForSvc(svc egressservices.Config, n ipn.Notify) (addrs []netip.Addr, err error) {
if svc.TailnetTarget.IP != "" {
addr, err := netip.ParseAddr(svc.TailnetTarget.IP)
if err != nil {
return nil, fmt.Errorf("error parsing tailnet target IP: %w", err)
}
if addr.Is6() && !ep.nfr.HasIPV6NAT() {
log.Printf("tailnet target is an IPv6 address, but this host does not support IPv6 in the chosen firewall mode. This will probably not work.")
return addrs, nil
}
return []netip.Addr{addr}, nil
}
if svc.TailnetTarget.FQDN == "" {
return nil, errors.New("unexpected egress service config- neither tailnet target IP nor FQDN is set")
}
if n.NetMap == nil {
log.Printf("netmap is not available, unable to determine backend addresses for %s", svc.TailnetTarget.FQDN)
return addrs, nil
}
var (
node tailcfg.NodeView
nodeFound bool
)
for _, nn := range n.NetMap.Peers {
if equalFQDNs(nn.Name(), svc.TailnetTarget.FQDN) {
node = nn
nodeFound = true
break
}
}
if nodeFound {
for _, addr := range node.Addresses().AsSlice() {
if addr.Addr().Is6() && !ep.nfr.HasIPV6NAT() {
log.Printf("tailnet target %v is an IPv6 address, but this host does not support IPv6 in the chosen firewall mode, skipping.", addr.Addr().String())
continue
}
addrs = append(addrs, addr.Addr())
}
// Egress target endpoints configured via FQDN are stored, so
// that we can determine if a netmap update should trigger a
// resync.
mak.Set(&ep.targetFQDNs, svc.TailnetTarget.FQDN, node.Addresses().AsSlice())
}
return addrs, nil
}
// shouldResync parses netmap update and returns true if the update contains
// changes for which the egress proxy's firewall should be reconfigured.
func (ep *egressProxy) shouldResync(n ipn.Notify) bool {
if n.NetMap == nil {
return false
}
// If proxy's tailnet addresses have changed, resync.
if !reflect.DeepEqual(n.NetMap.SelfNode.Addresses().AsSlice(), ep.tailnetAddrs) {
log.Printf("node addresses have changed, trigger egress config resync")
ep.tailnetAddrs = n.NetMap.SelfNode.Addresses().AsSlice()
return true
}
// If the IPs for any of the egress services configured via FQDN have
// changed, resync.
for fqdn, ips := range ep.targetFQDNs {
for _, nn := range n.NetMap.Peers {
if equalFQDNs(nn.Name(), fqdn) {
if !reflect.DeepEqual(ips, nn.Addresses().AsSlice()) {
log.Printf("backend addresses for egress target %q have changed old IPs %v, new IPs %v trigger egress config resync", nn.Name(), ips, nn.Addresses().AsSlice())
}
return true
}
}
}
return false
}
// ensureServiceDeleted ensures that any rules for an egress service are removed
// from the firewall configuration.
func ensureServiceDeleted(svcName string, svc *egressservices.ServiceStatus, nfr linuxfw.NetfilterRunner) error {
// Note that the portmap is needed for iptables based firewall only.
// Nftables group rules for a service in a chain, so there is no need to
// specify individual portmapping based rules.
pms := make([]linuxfw.PortMap, 0)
for pm := range svc.Ports {
pms = append(pms, linuxfw.PortMap{MatchPort: pm.MatchPort, TargetPort: pm.TargetPort, Protocol: pm.Protocol})
}
if err := nfr.DeleteSvc(svcName, tailscaleTunInterface, svc.TailnetTargetIPs, pms); err != nil {
return fmt.Errorf("error deleting service %s: %w", svcName, err)
}
return nil
}
// ensureRulesAdded ensures that all portmapping rules are added to the firewall
// configuration. For any rules that already exist, calling this function is a
// no-op. In case of nftables, a service consists of one or two (one per IP
// family) chains that conain the portmapping rules for the service and the
// chains as needed when this function is called.
func ensureRulesAdded(rulesPerSvc map[string][]rule, nfr linuxfw.NetfilterRunner) error {
for svc, rules := range rulesPerSvc {
for _, rule := range rules {
log.Printf("ensureRulesAdded svc %s tailnetTarget %s container port %d tailnet port %d protocol %s", svc, rule.tailnetIP, rule.containerPort, rule.tailnetPort, rule.protocol)
if err := nfr.EnsurePortMapRuleForSvc(svc, tailscaleTunInterface, rule.tailnetIP, linuxfw.PortMap{MatchPort: rule.containerPort, TargetPort: rule.tailnetPort, Protocol: rule.protocol}); err != nil {
return fmt.Errorf("error ensuring rule: %w", err)
}
}
}
return nil
}
// ensureRulesDeleted ensures that the given rules are deleted from the firewall
// configuration. For any rules that do not exist, calling this funcion is a
// no-op.
func ensureRulesDeleted(rulesPerSvc map[string][]rule, nfr linuxfw.NetfilterRunner) error {
for svc, rules := range rulesPerSvc {
for _, rule := range rules {
log.Printf("ensureRulesDeleted svc %s tailnetTarget %s container port %d tailnet port %d protocol %s", svc, rule.tailnetIP, rule.containerPort, rule.tailnetPort, rule.protocol)
if err := nfr.DeletePortMapRuleForSvc(svc, tailscaleTunInterface, rule.tailnetIP, linuxfw.PortMap{MatchPort: rule.containerPort, TargetPort: rule.tailnetPort, Protocol: rule.protocol}); err != nil {
return fmt.Errorf("error deleting rule: %w", err)
}
}
}
return nil
}
func lookupCurrentConfig(svcName string, status *egressservices.Status) (*egressservices.ServiceStatus, bool) {
if status == nil || len(status.Services) == 0 {
return nil, false
}
c, ok := status.Services[svcName]
return c, ok
}
func equalFQDNs(s, s1 string) bool {
s, _ = strings.CutSuffix(s, ".")
s1, _ = strings.CutSuffix(s1, ".")
return strings.EqualFold(s, s1)
}
// rule contains configuration for an egress proxy firewall rule.
type rule struct {
containerPort uint16 // port to match incoming traffic
tailnetPort uint16 // tailnet service port
tailnetIP netip.Addr // tailnet service IP
protocol string
}
func wantsServicesConfigured(cfgs *egressservices.Configs) bool {
return cfgs != nil && len(*cfgs) != 0
}
func hasServicesConfigured(status *egressservices.Status) bool {
return status != nil && len(status.Services) != 0
}
func servicesStatusIsEqual(st, st1 *egressservices.Status) bool {
if st == nil && st1 == nil {
return true
}
if st == nil || st1 == nil {
return false
}
st.PodIPv4 = ""
st1.PodIPv4 = ""
return reflect.DeepEqual(*st, *st1)
}

View File

@@ -0,0 +1,175 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build linux
package main
import (
"net/netip"
"reflect"
"testing"
"tailscale.com/kube/egressservices"
)
func Test_updatesForSvc(t *testing.T) {
tailnetIPv4, tailnetIPv6 := netip.MustParseAddr("100.99.99.99"), netip.MustParseAddr("fd7a:115c:a1e0::701:b62a")
tailnetIPv4_1, tailnetIPv6_1 := netip.MustParseAddr("100.88.88.88"), netip.MustParseAddr("fd7a:115c:a1e0::4101:512f")
ports := map[egressservices.PortMap]struct{}{{Protocol: "tcp", MatchPort: 4003, TargetPort: 80}: {}}
ports1 := map[egressservices.PortMap]struct{}{{Protocol: "udp", MatchPort: 4004, TargetPort: 53}: {}}
ports2 := map[egressservices.PortMap]struct{}{{Protocol: "tcp", MatchPort: 4003, TargetPort: 80}: {},
{Protocol: "tcp", MatchPort: 4005, TargetPort: 443}: {}}
fqdnSpec := egressservices.Config{
TailnetTarget: egressservices.TailnetTarget{FQDN: "test"},
Ports: ports,
}
fqdnSpec1 := egressservices.Config{
TailnetTarget: egressservices.TailnetTarget{FQDN: "test"},
Ports: ports1,
}
fqdnSpec2 := egressservices.Config{
TailnetTarget: egressservices.TailnetTarget{IP: tailnetIPv4.String()},
Ports: ports,
}
fqdnSpec3 := egressservices.Config{
TailnetTarget: egressservices.TailnetTarget{IP: tailnetIPv4.String()},
Ports: ports2,
}
r := rule{containerPort: 4003, tailnetPort: 80, protocol: "tcp", tailnetIP: tailnetIPv4}
r1 := rule{containerPort: 4003, tailnetPort: 80, protocol: "tcp", tailnetIP: tailnetIPv6}
r2 := rule{tailnetPort: 53, containerPort: 4004, protocol: "udp", tailnetIP: tailnetIPv4}
r3 := rule{tailnetPort: 53, containerPort: 4004, protocol: "udp", tailnetIP: tailnetIPv6}
r4 := rule{containerPort: 4003, tailnetPort: 80, protocol: "tcp", tailnetIP: tailnetIPv4_1}
r5 := rule{containerPort: 4003, tailnetPort: 80, protocol: "tcp", tailnetIP: tailnetIPv6_1}
r6 := rule{containerPort: 4005, tailnetPort: 443, protocol: "tcp", tailnetIP: tailnetIPv4}
tests := []struct {
name string
svcName string
tailnetTargetIPs []netip.Addr
podIP string
spec egressservices.Config
status *egressservices.Status
wantRulesToAdd []rule
wantRulesToDelete []rule
}{
{
name: "add_fqdn_svc_that_does_not_yet_exist",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4, tailnetIPv6},
spec: fqdnSpec,
status: &egressservices.Status{},
wantRulesToAdd: []rule{r, r1},
wantRulesToDelete: []rule{},
},
{
name: "fqdn_svc_already_exists",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4, tailnetIPv6},
spec: fqdnSpec,
status: &egressservices.Status{
Services: map[string]*egressservices.ServiceStatus{"test": {
TailnetTargetIPs: []netip.Addr{tailnetIPv4, tailnetIPv6},
TailnetTarget: egressservices.TailnetTarget{FQDN: "test"},
Ports: ports,
}}},
wantRulesToAdd: []rule{},
wantRulesToDelete: []rule{},
},
{
name: "fqdn_svc_already_exists_add_port_remove_port",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4, tailnetIPv6},
spec: fqdnSpec1,
status: &egressservices.Status{
Services: map[string]*egressservices.ServiceStatus{"test": {
TailnetTargetIPs: []netip.Addr{tailnetIPv4, tailnetIPv6},
TailnetTarget: egressservices.TailnetTarget{FQDN: "test"},
Ports: ports,
}}},
wantRulesToAdd: []rule{r2, r3},
wantRulesToDelete: []rule{r, r1},
},
{
name: "fqdn_svc_already_exists_change_fqdn_backend_ips",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4_1, tailnetIPv6_1},
spec: fqdnSpec,
status: &egressservices.Status{
Services: map[string]*egressservices.ServiceStatus{"test": {
TailnetTargetIPs: []netip.Addr{tailnetIPv4, tailnetIPv6},
TailnetTarget: egressservices.TailnetTarget{FQDN: "test"},
Ports: ports,
}}},
wantRulesToAdd: []rule{r4, r5},
wantRulesToDelete: []rule{r, r1},
},
{
name: "add_ip_service",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4},
spec: fqdnSpec2,
status: &egressservices.Status{},
wantRulesToAdd: []rule{r},
wantRulesToDelete: []rule{},
},
{
name: "add_ip_service_already_exists",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4},
spec: fqdnSpec2,
status: &egressservices.Status{
Services: map[string]*egressservices.ServiceStatus{"test": {
TailnetTargetIPs: []netip.Addr{tailnetIPv4},
TailnetTarget: egressservices.TailnetTarget{IP: tailnetIPv4.String()},
Ports: ports,
}}},
wantRulesToAdd: []rule{},
wantRulesToDelete: []rule{},
},
{
name: "ip_service_add_port",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4},
spec: fqdnSpec3,
status: &egressservices.Status{
Services: map[string]*egressservices.ServiceStatus{"test": {
TailnetTargetIPs: []netip.Addr{tailnetIPv4},
TailnetTarget: egressservices.TailnetTarget{IP: tailnetIPv4.String()},
Ports: ports,
}}},
wantRulesToAdd: []rule{r6},
wantRulesToDelete: []rule{},
},
{
name: "ip_service_delete_port",
svcName: "test",
tailnetTargetIPs: []netip.Addr{tailnetIPv4},
spec: fqdnSpec,
status: &egressservices.Status{
Services: map[string]*egressservices.ServiceStatus{"test": {
TailnetTargetIPs: []netip.Addr{tailnetIPv4},
TailnetTarget: egressservices.TailnetTarget{IP: tailnetIPv4.String()},
Ports: ports2,
}}},
wantRulesToAdd: []rule{},
wantRulesToDelete: []rule{r6},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
gotRulesToAdd, gotRulesToDelete, err := updatesForCfg(tt.svcName, tt.spec, tt.status, tt.tailnetTargetIPs)
if err != nil {
t.Errorf("updatesForSvc() unexpected error %v", err)
return
}
if !reflect.DeepEqual(gotRulesToAdd, tt.wantRulesToAdd) {
t.Errorf("updatesForSvc() got rulesToAdd = \n%v\n want rulesToAdd \n%v", gotRulesToAdd, tt.wantRulesToAdd)
}
if !reflect.DeepEqual(gotRulesToDelete, tt.wantRulesToDelete) {
t.Errorf("updatesForSvc() got rulesToDelete = \n%v\n want rulesToDelete \n%v", gotRulesToDelete, tt.wantRulesToDelete)
}
})
}
}

View File

@@ -14,6 +14,7 @@ import (
"os"
"path"
"strconv"
"strings"
"tailscale.com/ipn/conffile"
"tailscale.com/kube/kubeclient"
@@ -62,8 +63,65 @@ type settings struct {
// PodIP is the IP of the Pod if running in Kubernetes. This is used
// when setting up rules to proxy cluster traffic to cluster ingress
// target.
// Deprecated: use PodIPv4, PodIPv6 instead to support dual stack clusters
PodIP string
PodIPv4 string
PodIPv6 string
HealthCheckAddrPort string
EgressSvcsCfgPath string
}
func configFromEnv() (*settings, error) {
cfg := &settings{
AuthKey: defaultEnvs([]string{"TS_AUTHKEY", "TS_AUTH_KEY"}, ""),
Hostname: defaultEnv("TS_HOSTNAME", ""),
Routes: defaultEnvStringPointer("TS_ROUTES"),
ServeConfigPath: defaultEnv("TS_SERVE_CONFIG", ""),
ProxyTargetIP: defaultEnv("TS_DEST_IP", ""),
ProxyTargetDNSName: defaultEnv("TS_EXPERIMENTAL_DEST_DNS_NAME", ""),
TailnetTargetIP: defaultEnv("TS_TAILNET_TARGET_IP", ""),
TailnetTargetFQDN: defaultEnv("TS_TAILNET_TARGET_FQDN", ""),
DaemonExtraArgs: defaultEnv("TS_TAILSCALED_EXTRA_ARGS", ""),
ExtraArgs: defaultEnv("TS_EXTRA_ARGS", ""),
InKubernetes: os.Getenv("KUBERNETES_SERVICE_HOST") != "",
UserspaceMode: defaultBool("TS_USERSPACE", true),
StateDir: defaultEnv("TS_STATE_DIR", ""),
AcceptDNS: defaultEnvBoolPointer("TS_ACCEPT_DNS"),
KubeSecret: defaultEnv("TS_KUBE_SECRET", "tailscale"),
SOCKSProxyAddr: defaultEnv("TS_SOCKS5_SERVER", ""),
HTTPProxyAddr: defaultEnv("TS_OUTBOUND_HTTP_PROXY_LISTEN", ""),
Socket: defaultEnv("TS_SOCKET", "/tmp/tailscaled.sock"),
AuthOnce: defaultBool("TS_AUTH_ONCE", false),
Root: defaultEnv("TS_TEST_ONLY_ROOT", "/"),
TailscaledConfigFilePath: tailscaledConfigFilePath(),
AllowProxyingClusterTrafficViaIngress: defaultBool("EXPERIMENTAL_ALLOW_PROXYING_CLUSTER_TRAFFIC_VIA_INGRESS", false),
PodIP: defaultEnv("POD_IP", ""),
EnableForwardingOptimizations: defaultBool("TS_EXPERIMENTAL_ENABLE_FORWARDING_OPTIMIZATIONS", false),
HealthCheckAddrPort: defaultEnv("TS_HEALTHCHECK_ADDR_PORT", ""),
EgressSvcsCfgPath: defaultEnv("TS_EGRESS_SERVICES_CONFIG_PATH", ""),
}
podIPs, ok := os.LookupEnv("POD_IPS")
if ok {
ips := strings.Split(podIPs, ",")
if len(ips) > 2 {
return nil, fmt.Errorf("POD_IPs can contain at most 2 IPs, got %d (%v)", len(ips), ips)
}
for _, ip := range ips {
parsed, err := netip.ParseAddr(ip)
if err != nil {
return nil, fmt.Errorf("error parsing IP address %s: %w", ip, err)
}
if parsed.Is4() {
cfg.PodIPv4 = parsed.String()
continue
}
cfg.PodIPv6 = parsed.String()
}
}
if err := cfg.validate(); err != nil {
return nil, fmt.Errorf("invalid configuration: %v", err)
}
return cfg, nil
}
func (s *settings) validate() error {
@@ -129,44 +187,51 @@ func (cfg *settings) setupKube(ctx context.Context) error {
}
canPatch, canCreate, err := kc.CheckSecretPermissions(ctx, cfg.KubeSecret)
if err != nil {
return fmt.Errorf("Some Kubernetes permissions are missing, please check your RBAC configuration: %v", err)
return fmt.Errorf("some Kubernetes permissions are missing, please check your RBAC configuration: %v", err)
}
cfg.KubernetesCanPatch = canPatch
s, err := kc.GetSecret(ctx, cfg.KubeSecret)
if err != nil && kubeclient.IsNotFoundErr(err) && !canCreate {
return fmt.Errorf("Tailscale state Secret %s does not exist and we don't have permissions to create it. "+
"If you intend to store tailscale state elsewhere than a Kubernetes Secret, "+
"you can explicitly set TS_KUBE_SECRET env var to an empty string. "+
"Else ensure that RBAC is set up that allows the service account associated with this installation to create Secrets.", cfg.KubeSecret)
} else if err != nil && !kubeclient.IsNotFoundErr(err) {
return fmt.Errorf("Getting Tailscale state Secret %s: %v", cfg.KubeSecret, err)
}
if cfg.AuthKey == "" && !isOneStepConfig(cfg) {
if s == nil {
log.Print("TS_AUTHKEY not provided and kube secret does not exist, login will be interactive if needed.")
return nil
if err != nil {
if !kubeclient.IsNotFoundErr(err) {
return fmt.Errorf("getting Tailscale state Secret %s: %v", cfg.KubeSecret, err)
}
keyBytes, _ := s.Data["authkey"]
key := string(keyBytes)
if key != "" {
// This behavior of pulling authkeys from kube secrets was added
// at the same time as the patch permission, so we can enforce
// that we must be able to patch out the authkey after
// authenticating if you want to use this feature. This avoids
// us having to deal with the case where we might leave behind
// an unnecessary reusable authkey in a secret, like a rake in
// the grass.
if !cfg.KubernetesCanPatch {
return errors.New("authkey found in TS_KUBE_SECRET, but the pod doesn't have patch permissions on the secret to manage the authkey.")
}
cfg.AuthKey = key
} else {
log.Print("No authkey found in kube secret and TS_AUTHKEY not provided, login will be interactive if needed.")
if !canCreate {
return fmt.Errorf("tailscale state Secret %s does not exist and we don't have permissions to create it. "+
"If you intend to store tailscale state elsewhere than a Kubernetes Secret, "+
"you can explicitly set TS_KUBE_SECRET env var to an empty string. "+
"Else ensure that RBAC is set up that allows the service account associated with this installation to create Secrets.", cfg.KubeSecret)
}
}
// Return early if we already have an auth key.
if cfg.AuthKey != "" || isOneStepConfig(cfg) {
return nil
}
if s == nil {
log.Print("TS_AUTHKEY not provided and state Secret does not exist, login will be interactive if needed.")
return nil
}
keyBytes, _ := s.Data["authkey"]
key := string(keyBytes)
if key != "" {
// Enforce that we must be able to patch out the authkey after
// authenticating if you want to use this feature. This avoids
// us having to deal with the case where we might leave behind
// an unnecessary reusable authkey in a secret, like a rake in
// the grass.
if !cfg.KubernetesCanPatch {
return errors.New("authkey found in TS_KUBE_SECRET, but the pod doesn't have patch permissions on the Secret to manage the authkey.")
}
cfg.AuthKey = key
}
log.Print("No authkey found in state Secret and TS_AUTHKEY not provided, login will be interactive if needed.")
return nil
}
@@ -198,7 +263,7 @@ func isOneStepConfig(cfg *settings) bool {
// as an L3 proxy, proxying to an endpoint provided via one of the config env
// vars.
func isL3Proxy(cfg *settings) bool {
return cfg.ProxyTargetIP != "" || cfg.ProxyTargetDNSName != "" || cfg.TailnetTargetIP != "" || cfg.TailnetTargetFQDN != "" || cfg.AllowProxyingClusterTrafficViaIngress
return cfg.ProxyTargetIP != "" || cfg.ProxyTargetDNSName != "" || cfg.TailnetTargetIP != "" || cfg.TailnetTargetFQDN != "" || cfg.AllowProxyingClusterTrafficViaIngress || cfg.EgressSvcsCfgPath != ""
}
// hasKubeStateStore returns true if the state must be stored in a Kubernetes

View File

@@ -113,6 +113,7 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
tailscale.com/net/stunserver from tailscale.com/cmd/derper
L tailscale.com/net/tcpinfo from tailscale.com/derp
tailscale.com/net/tlsdial from tailscale.com/derp/derphttp
tailscale.com/net/tlsdial/blockblame from tailscale.com/net/tlsdial
tailscale.com/net/tsaddr from tailscale.com/ipn+
💣 tailscale.com/net/tshttpproxy from tailscale.com/derp/derphttp+
tailscale.com/net/wsconn from tailscale.com/cmd/derper+
@@ -128,7 +129,7 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
tailscale.com/tsweb from tailscale.com/cmd/derper
tailscale.com/tsweb/promvarz from tailscale.com/tsweb
tailscale.com/tsweb/varz from tailscale.com/tsweb+
tailscale.com/types/dnstype from tailscale.com/tailcfg
tailscale.com/types/dnstype from tailscale.com/tailcfg+
tailscale.com/types/empty from tailscale.com/ipn
tailscale.com/types/ipproto from tailscale.com/tailcfg+
tailscale.com/types/key from tailscale.com/client/tailscale+
@@ -162,7 +163,8 @@ tailscale.com/cmd/derper dependencies: (generated by github.com/tailscale/depawa
tailscale.com/util/singleflight from tailscale.com/net/dnscache
tailscale.com/util/slicesx from tailscale.com/cmd/derper+
tailscale.com/util/syspolicy from tailscale.com/ipn
tailscale.com/util/syspolicy/internal from tailscale.com/util/syspolicy/setting
tailscale.com/util/syspolicy/internal from tailscale.com/util/syspolicy/setting+
tailscale.com/util/syspolicy/internal/loggerx from tailscale.com/util/syspolicy
tailscale.com/util/syspolicy/setting from tailscale.com/util/syspolicy
tailscale.com/util/usermetric from tailscale.com/health
tailscale.com/util/vizerror from tailscale.com/tailcfg+

View File

@@ -75,6 +75,11 @@ func main() {
prober.WithPageLink("Prober metrics", "/debug/varz"),
prober.WithProbeLink("Run Probe", "/debug/probe-run?name={{.Name}}"),
), tsweb.HandlerOptions{Logf: log.Printf}))
mux.Handle("/healthz", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "text/plain")
w.WriteHeader(http.StatusOK)
w.Write([]byte("ok\n"))
}))
log.Printf("Listening on %s", *listen)
log.Fatal(http.ListenAndServe(*listen, mux))
}

View File

@@ -51,6 +51,7 @@ func main() {
ctx := context.Background()
tsClient := tailscale.NewClient("-", nil)
tsClient.UserAgent = "tailscale-get-authkey"
tsClient.HTTPClient = credentials.Client(ctx)
tsClient.BaseURL = baseURL

View File

@@ -278,7 +278,7 @@ func TestConnectorWithProxyClass(t *testing.T) {
pc.Status = tsapi.ProxyClassStatus{
Conditions: []metav1.Condition{{
Status: metav1.ConditionTrue,
Type: string(tsapi.ProxyClassready),
Type: string(tsapi.ProxyClassReady),
ObservedGeneration: pc.Generation,
}}}
})

View File

@@ -310,7 +310,7 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
gvisor.dev/gvisor/pkg/tcpip/network/internal/ip from gvisor.dev/gvisor/pkg/tcpip/network/ipv4+
gvisor.dev/gvisor/pkg/tcpip/network/internal/multicast from gvisor.dev/gvisor/pkg/tcpip/network/ipv4+
gvisor.dev/gvisor/pkg/tcpip/network/ipv4 from tailscale.com/net/tstun+
gvisor.dev/gvisor/pkg/tcpip/network/ipv6 from tailscale.com/wgengine/netstack
gvisor.dev/gvisor/pkg/tcpip/network/ipv6 from tailscale.com/wgengine/netstack+
gvisor.dev/gvisor/pkg/tcpip/ports from gvisor.dev/gvisor/pkg/tcpip/stack+
gvisor.dev/gvisor/pkg/tcpip/seqnum from gvisor.dev/gvisor/pkg/tcpip/header+
💣 gvisor.dev/gvisor/pkg/tcpip/stack from gvisor.dev/gvisor/pkg/tcpip/adapters/gonet+
@@ -654,11 +654,12 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
tailscale.com/client/tailscale/apitype from tailscale.com/client/tailscale+
tailscale.com/client/web from tailscale.com/ipn/ipnlocal
tailscale.com/clientupdate from tailscale.com/client/web+
tailscale.com/clientupdate/distsign from tailscale.com/clientupdate
LW tailscale.com/clientupdate/distsign from tailscale.com/clientupdate
tailscale.com/control/controlbase from tailscale.com/control/controlhttp+
tailscale.com/control/controlclient from tailscale.com/ipn/ipnlocal+
tailscale.com/control/controlhttp from tailscale.com/control/controlclient
tailscale.com/control/controlknobs from tailscale.com/control/controlclient+
tailscale.com/control/keyfallback from tailscale.com/control/controlclient
tailscale.com/derp from tailscale.com/derp/derphttp+
tailscale.com/derp/derphttp from tailscale.com/ipn/localapi+
tailscale.com/disco from tailscale.com/derp+
@@ -668,6 +669,7 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
tailscale.com/doctor/routetable from tailscale.com/ipn/ipnlocal
tailscale.com/drive from tailscale.com/client/tailscale+
tailscale.com/envknob from tailscale.com/client/tailscale+
tailscale.com/envknob/featureknob from tailscale.com/client/web+
tailscale.com/health from tailscale.com/control/controlclient+
tailscale.com/health/healthmsg from tailscale.com/ipn/ipnlocal
tailscale.com/hostinfo from tailscale.com/client/web+
@@ -690,6 +692,7 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
tailscale.com/k8s-operator/sessionrecording/spdy from tailscale.com/k8s-operator/sessionrecording
tailscale.com/k8s-operator/sessionrecording/tsrecorder from tailscale.com/k8s-operator/sessionrecording+
tailscale.com/k8s-operator/sessionrecording/ws from tailscale.com/k8s-operator/sessionrecording
tailscale.com/kube/egressservices from tailscale.com/cmd/k8s-operator
tailscale.com/kube/kubeapi from tailscale.com/ipn/store/kubestore+
tailscale.com/kube/kubeclient from tailscale.com/ipn/store/kubestore
tailscale.com/kube/kubetypes from tailscale.com/cmd/k8s-operator+
@@ -733,6 +736,7 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
tailscale.com/net/stun from tailscale.com/ipn/localapi+
L tailscale.com/net/tcpinfo from tailscale.com/derp
tailscale.com/net/tlsdial from tailscale.com/control/controlclient+
tailscale.com/net/tlsdial/blockblame from tailscale.com/net/tlsdial
tailscale.com/net/tsaddr from tailscale.com/client/web+
tailscale.com/net/tsdial from tailscale.com/control/controlclient+
💣 tailscale.com/net/tshttpproxy from tailscale.com/clientupdate/distsign+
@@ -808,7 +812,8 @@ tailscale.com/cmd/k8s-operator dependencies: (generated by github.com/tailscale/
tailscale.com/util/singleflight from tailscale.com/control/controlclient+
tailscale.com/util/slicesx from tailscale.com/appc+
tailscale.com/util/syspolicy from tailscale.com/control/controlclient+
tailscale.com/util/syspolicy/internal from tailscale.com/util/syspolicy/setting
tailscale.com/util/syspolicy/internal from tailscale.com/util/syspolicy/setting+
tailscale.com/util/syspolicy/internal/loggerx from tailscale.com/util/syspolicy
tailscale.com/util/syspolicy/setting from tailscale.com/util/syspolicy
tailscale.com/util/sysresources from tailscale.com/wgengine/magicsock
tailscale.com/util/systemd from tailscale.com/control/controlclient+

View File

@@ -22,7 +22,7 @@ rules:
resources: ["ingressclasses"]
verbs: ["get", "list", "watch"]
- apiGroups: ["tailscale.com"]
resources: ["connectors", "connectors/status", "proxyclasses", "proxyclasses/status"]
resources: ["connectors", "connectors/status", "proxyclasses", "proxyclasses/status", "proxygroups", "proxygroups/status"]
verbs: ["get", "list", "watch", "update"]
- apiGroups: ["tailscale.com"]
resources: ["dnsconfigs", "dnsconfigs/status"]
@@ -53,12 +53,15 @@ rules:
- apiGroups: [""]
resources: ["secrets", "serviceaccounts", "configmaps"]
verbs: ["create","delete","deletecollection","get","list","patch","update","watch"]
- apiGroups: [""]
resources: ["pods"]
verbs: ["get","list","watch"]
- apiGroups: ["apps"]
resources: ["statefulsets", "deployments"]
verbs: ["create","delete","deletecollection","get","list","patch","update","watch"]
- apiGroups: ["discovery.k8s.io"]
resources: ["endpointslices"]
verbs: ["get", "list", "watch"]
verbs: ["get", "list", "watch", "create", "update", "deletecollection"]
- apiGroups: ["rbac.authorization.k8s.io"]
resources: ["roles", "rolebindings"]
verbs: ["get", "create", "patch", "update", "list", "watch"]

View File

@@ -57,12 +57,12 @@ operatorConfig:
# proxyConfig contains configuraton that will be applied to any ingress/egress
# proxies created by the operator.
# https://tailscale.com/kb/1236/kubernetes-operator/#cluster-ingress
# https://tailscale.com/kb/1236/kubernetes-operator/#cluster-egress
# https://tailscale.com/kb/1439/kubernetes-operator-cluster-ingress
# https://tailscale.com/kb/1438/kubernetes-operator-cluster-egress
# Note that this section contains only a few global configuration options and
# will not be updated with more configuration options in the future.
# If you need more configuration options, take a look at ProxyClass:
# https://tailscale.com/kb/1236/kubernetes-operator#cluster-resource-customization-using-proxyclass-custom-resource
# https://tailscale.com/kb/1445/kubernetes-operator-customization#cluster-resource-customization-using-proxyclass-custom-resource
proxyConfig:
image:
# Repository defaults to DockerHub, but images are also synced to ghcr.io/tailscale/tailscale.
@@ -79,12 +79,13 @@ proxyConfig:
defaultTags: "tag:k8s"
firewallMode: auto
# If defined, this proxy class will be used as the default proxy class for
# service and ingress resources that do not have a proxy class defined.
# service and ingress resources that do not have a proxy class defined. It
# does not apply to Connector resources.
defaultProxyClass: ""
# apiServerProxyConfig allows to configure whether the operator should expose
# Kubernetes API server.
# https://tailscale.com/kb/1236/kubernetes-operator/#accessing-the-kubernetes-control-plane-using-an-api-server-proxy
# https://tailscale.com/kb/1437/kubernetes-operator-api-server-proxy
apiServerProxyConfig:
mode: "false" # "true", "false", "noauth"

View File

@@ -37,7 +37,7 @@ spec:
exit node.
Connector is a cluster-scoped resource.
More info:
https://tailscale.com/kb/1236/kubernetes-operator#deploying-exit-nodes-and-subnet-routers-on-kubernetes-using-connector-custom-resource
https://tailscale.com/kb/1441/kubernetes-operator-connector
type: object
required:
- spec
@@ -115,7 +115,7 @@ spec:
To autoapprove the subnet routes or exit node defined by a Connector,
you can configure Tailscale ACLs to give these tags the necessary
permissions.
See https://tailscale.com/kb/1018/acls/#auto-approvers-for-routes-and-exit-nodes.
See https://tailscale.com/kb/1337/acl-syntax#autoapprovers.
If you specify custom tags here, you must also make the operator an owner of these tags.
See https://tailscale.com/kb/1236/kubernetes-operator/#setting-up-the-kubernetes-operator.
Tags cannot be changed once a Connector node has been created.

View File

@@ -30,7 +30,7 @@ spec:
connector.spec.proxyClass field.
ProxyClass is a cluster scoped resource.
More info:
https://tailscale.com/kb/1236/kubernetes-operator#cluster-resource-customization-using-proxyclass-custom-resource.
https://tailscale.com/kb/1445/kubernetes-operator-customization#cluster-resource-customization-using-proxyclass-custom-resource
type: object
required:
- spec
@@ -1908,7 +1908,7 @@ spec:
routes advertized by other nodes on the tailnet, such as subnet
routes.
This is equivalent of passing --accept-routes flag to a tailscale Linux client.
https://tailscale.com/kb/1019/subnets#use-your-subnet-routes-from-other-machines
https://tailscale.com/kb/1019/subnets#use-your-subnet-routes-from-other-devices
Defaults to false.
type: boolean
status:

View File

@@ -0,0 +1,187 @@
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.15.1-0.20240618033008-7824932b0cab
name: proxygroups.tailscale.com
spec:
group: tailscale.com
names:
kind: ProxyGroup
listKind: ProxyGroupList
plural: proxygroups
shortNames:
- pg
singular: proxygroup
scope: Cluster
versions:
- additionalPrinterColumns:
- description: Status of the deployed ProxyGroup resources.
jsonPath: .status.conditions[?(@.type == "ProxyGroupReady")].reason
name: Status
type: string
name: v1alpha1
schema:
openAPIV3Schema:
type: object
required:
- spec
properties:
apiVersion:
description: |-
APIVersion defines the versioned schema of this representation of an object.
Servers should convert recognized schemas to the latest internal value, and
may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
type: string
kind:
description: |-
Kind is a string value representing the REST resource this object represents.
Servers may infer this from the endpoint the client submits requests to.
Cannot be updated.
In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
type: string
metadata:
type: object
spec:
description: Spec describes the desired ProxyGroup instances.
type: object
required:
- type
properties:
hostnamePrefix:
description: |-
HostnamePrefix is the hostname prefix to use for tailnet devices created
by the ProxyGroup. Each device will have the integer number from its
StatefulSet pod appended to this prefix to form the full hostname.
HostnamePrefix can contain lower case letters, numbers and dashes, it
must not start with a dash and must be between 1 and 62 characters long.
type: string
pattern: ^[a-z0-9][a-z0-9-]{0,61}$
proxyClass:
description: |-
ProxyClass is the name of the ProxyClass custom resource that contains
configuration options that should be applied to the resources created
for this ProxyGroup. If unset, and there is no default ProxyClass
configured, the operator will create resources with the default
configuration.
type: string
replicas:
description: |-
Replicas specifies how many replicas to create the StatefulSet with.
Defaults to 2.
type: integer
format: int32
tags:
description: |-
Tags that the Tailscale devices will be tagged with. Defaults to [tag:k8s].
If you specify custom tags here, make sure you also make the operator
an owner of these tags.
See https://tailscale.com/kb/1236/kubernetes-operator/#setting-up-the-kubernetes-operator.
Tags cannot be changed once a ProxyGroup device has been created.
Tag values must be in form ^tag:[a-zA-Z][a-zA-Z0-9-]*$.
type: array
items:
type: string
pattern: ^tag:[a-zA-Z][a-zA-Z0-9-]*$
type:
description: Type of the ProxyGroup proxies. Currently the only supported type is egress.
type: string
enum:
- egress
status:
description: |-
ProxyGroupStatus describes the status of the ProxyGroup resources. This is
set and managed by the Tailscale operator.
type: object
properties:
conditions:
description: |-
List of status conditions to indicate the status of the ProxyGroup
resources. Known condition types are `ProxyGroupReady`.
type: array
items:
description: Condition contains details for one aspect of the current state of this API Resource.
type: object
required:
- lastTransitionTime
- message
- reason
- status
- type
properties:
lastTransitionTime:
description: |-
lastTransitionTime is the last time the condition transitioned from one status to another.
This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
type: string
format: date-time
message:
description: |-
message is a human readable message indicating details about the transition.
This may be an empty string.
type: string
maxLength: 32768
observedGeneration:
description: |-
observedGeneration represents the .metadata.generation that the condition was set based upon.
For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
with respect to the current state of the instance.
type: integer
format: int64
minimum: 0
reason:
description: |-
reason contains a programmatic identifier indicating the reason for the condition's last transition.
Producers of specific condition types may define expected values and meanings for this field,
and whether the values are considered a guaranteed API.
The value should be a CamelCase string.
This field may not be empty.
type: string
maxLength: 1024
minLength: 1
pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
status:
description: status of the condition, one of True, False, Unknown.
type: string
enum:
- "True"
- "False"
- Unknown
type:
description: type of condition in CamelCase or in foo.example.com/CamelCase.
type: string
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
x-kubernetes-list-map-keys:
- type
x-kubernetes-list-type: map
devices:
description: List of tailnet devices associated with the ProxyGroup StatefulSet.
type: array
items:
type: object
required:
- hostname
properties:
hostname:
description: |-
Hostname is the fully qualified domain name of the device.
If MagicDNS is enabled in your tailnet, it is the MagicDNS name of the
node.
type: string
tailnetIPs:
description: |-
TailnetIPs is the set of tailnet IP addresses (both IPv4 and IPv6)
assigned to the device.
type: array
items:
type: string
x-kubernetes-list-map-keys:
- hostname
x-kubernetes-list-type: map
served: true
storage: true
subresources:
status: {}

View File

@@ -1670,7 +1670,7 @@ spec:
- type
x-kubernetes-list-type: map
devices:
description: List of tailnet devices associated with the Recorder statefulset.
description: List of tailnet devices associated with the Recorder StatefulSet.
type: array
items:
type: object

View File

@@ -0,0 +1,7 @@
apiVersion: tailscale.com/v1alpha1
kind: ProxyGroup
metadata:
name: egress-proxies
spec:
type: egress
replicas: 3

View File

@@ -66,7 +66,7 @@ spec:
exit node.
Connector is a cluster-scoped resource.
More info:
https://tailscale.com/kb/1236/kubernetes-operator#deploying-exit-nodes-and-subnet-routers-on-kubernetes-using-connector-custom-resource
https://tailscale.com/kb/1441/kubernetes-operator-connector
properties:
apiVersion:
description: |-
@@ -140,7 +140,7 @@ spec:
To autoapprove the subnet routes or exit node defined by a Connector,
you can configure Tailscale ACLs to give these tags the necessary
permissions.
See https://tailscale.com/kb/1018/acls/#auto-approvers-for-routes-and-exit-nodes.
See https://tailscale.com/kb/1337/acl-syntax#autoapprovers.
If you specify custom tags here, you must also make the operator an owner of these tags.
See https://tailscale.com/kb/1236/kubernetes-operator/#setting-up-the-kubernetes-operator.
Tags cannot be changed once a Connector node has been created.
@@ -463,7 +463,7 @@ spec:
connector.spec.proxyClass field.
ProxyClass is a cluster scoped resource.
More info:
https://tailscale.com/kb/1236/kubernetes-operator#cluster-resource-customization-using-proxyclass-custom-resource.
https://tailscale.com/kb/1445/kubernetes-operator-customization#cluster-resource-customization-using-proxyclass-custom-resource
properties:
apiVersion:
description: |-
@@ -2336,7 +2336,7 @@ spec:
routes advertized by other nodes on the tailnet, such as subnet
routes.
This is equivalent of passing --accept-routes flag to a tailscale Linux client.
https://tailscale.com/kb/1019/subnets#use-your-subnet-routes-from-other-machines
https://tailscale.com/kb/1019/subnets#use-your-subnet-routes-from-other-devices
Defaults to false.
type: boolean
type: object
@@ -2418,6 +2418,194 @@ spec:
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.15.1-0.20240618033008-7824932b0cab
name: proxygroups.tailscale.com
spec:
group: tailscale.com
names:
kind: ProxyGroup
listKind: ProxyGroupList
plural: proxygroups
shortNames:
- pg
singular: proxygroup
scope: Cluster
versions:
- additionalPrinterColumns:
- description: Status of the deployed ProxyGroup resources.
jsonPath: .status.conditions[?(@.type == "ProxyGroupReady")].reason
name: Status
type: string
name: v1alpha1
schema:
openAPIV3Schema:
properties:
apiVersion:
description: |-
APIVersion defines the versioned schema of this representation of an object.
Servers should convert recognized schemas to the latest internal value, and
may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
type: string
kind:
description: |-
Kind is a string value representing the REST resource this object represents.
Servers may infer this from the endpoint the client submits requests to.
Cannot be updated.
In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
type: string
metadata:
type: object
spec:
description: Spec describes the desired ProxyGroup instances.
properties:
hostnamePrefix:
description: |-
HostnamePrefix is the hostname prefix to use for tailnet devices created
by the ProxyGroup. Each device will have the integer number from its
StatefulSet pod appended to this prefix to form the full hostname.
HostnamePrefix can contain lower case letters, numbers and dashes, it
must not start with a dash and must be between 1 and 62 characters long.
pattern: ^[a-z0-9][a-z0-9-]{0,61}$
type: string
proxyClass:
description: |-
ProxyClass is the name of the ProxyClass custom resource that contains
configuration options that should be applied to the resources created
for this ProxyGroup. If unset, and there is no default ProxyClass
configured, the operator will create resources with the default
configuration.
type: string
replicas:
description: |-
Replicas specifies how many replicas to create the StatefulSet with.
Defaults to 2.
format: int32
type: integer
tags:
description: |-
Tags that the Tailscale devices will be tagged with. Defaults to [tag:k8s].
If you specify custom tags here, make sure you also make the operator
an owner of these tags.
See https://tailscale.com/kb/1236/kubernetes-operator/#setting-up-the-kubernetes-operator.
Tags cannot be changed once a ProxyGroup device has been created.
Tag values must be in form ^tag:[a-zA-Z][a-zA-Z0-9-]*$.
items:
pattern: ^tag:[a-zA-Z][a-zA-Z0-9-]*$
type: string
type: array
type:
description: Type of the ProxyGroup proxies. Currently the only supported type is egress.
enum:
- egress
type: string
required:
- type
type: object
status:
description: |-
ProxyGroupStatus describes the status of the ProxyGroup resources. This is
set and managed by the Tailscale operator.
properties:
conditions:
description: |-
List of status conditions to indicate the status of the ProxyGroup
resources. Known condition types are `ProxyGroupReady`.
items:
description: Condition contains details for one aspect of the current state of this API Resource.
properties:
lastTransitionTime:
description: |-
lastTransitionTime is the last time the condition transitioned from one status to another.
This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
format: date-time
type: string
message:
description: |-
message is a human readable message indicating details about the transition.
This may be an empty string.
maxLength: 32768
type: string
observedGeneration:
description: |-
observedGeneration represents the .metadata.generation that the condition was set based upon.
For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
with respect to the current state of the instance.
format: int64
minimum: 0
type: integer
reason:
description: |-
reason contains a programmatic identifier indicating the reason for the condition's last transition.
Producers of specific condition types may define expected values and meanings for this field,
and whether the values are considered a guaranteed API.
The value should be a CamelCase string.
This field may not be empty.
maxLength: 1024
minLength: 1
pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
type: string
status:
description: status of the condition, one of True, False, Unknown.
enum:
- "True"
- "False"
- Unknown
type: string
type:
description: type of condition in CamelCase or in foo.example.com/CamelCase.
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
type: string
required:
- lastTransitionTime
- message
- reason
- status
- type
type: object
type: array
x-kubernetes-list-map-keys:
- type
x-kubernetes-list-type: map
devices:
description: List of tailnet devices associated with the ProxyGroup StatefulSet.
items:
properties:
hostname:
description: |-
Hostname is the fully qualified domain name of the device.
If MagicDNS is enabled in your tailnet, it is the MagicDNS name of the
node.
type: string
tailnetIPs:
description: |-
TailnetIPs is the set of tailnet IP addresses (both IPv4 and IPv6)
assigned to the device.
items:
type: string
type: array
required:
- hostname
type: object
type: array
x-kubernetes-list-map-keys:
- hostname
x-kubernetes-list-type: map
type: object
required:
- spec
type: object
served: true
storage: true
subresources:
status: {}
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.15.1-0.20240618033008-7824932b0cab
@@ -4084,7 +4272,7 @@ spec:
- type
x-kubernetes-list-type: map
devices:
description: List of tailnet devices associated with the Recorder statefulset.
description: List of tailnet devices associated with the Recorder StatefulSet.
items:
properties:
hostname:
@@ -4171,6 +4359,8 @@ rules:
- connectors/status
- proxyclasses
- proxyclasses/status
- proxygroups
- proxygroups/status
verbs:
- get
- list
@@ -4231,6 +4421,14 @@ rules:
- patch
- update
- watch
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- list
- watch
- apiGroups:
- apps
resources:
@@ -4253,6 +4451,9 @@ rules:
- get
- list
- watch
- create
- update
- deletecollection
- apiGroups:
- rbac.authorization.k8s.io
resources:

View File

@@ -0,0 +1,213 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"context"
"encoding/json"
"fmt"
"net/netip"
"reflect"
"strings"
"go.uber.org/zap"
corev1 "k8s.io/api/core/v1"
discoveryv1 "k8s.io/api/discovery/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
tsoperator "tailscale.com/k8s-operator"
"tailscale.com/kube/egressservices"
"tailscale.com/types/ptr"
)
// egressEpsReconciler reconciles EndpointSlices for tailnet services exposed to cluster via egress ProxyGroup proxies.
type egressEpsReconciler struct {
client.Client
logger *zap.SugaredLogger
tsNamespace string
}
// Reconcile reconciles an EndpointSlice for a tailnet service. It updates the EndpointSlice with the endpoints of
// those ProxyGroup Pods that are ready to route traffic to the tailnet service.
// It compares tailnet service state stored in egress proxy state Secrets by containerboot with the desired
// configuration stored in proxy-cfg ConfigMap to determine if the endpoint is ready.
func (er *egressEpsReconciler) Reconcile(ctx context.Context, req reconcile.Request) (res reconcile.Result, err error) {
l := er.logger.With("Service", req.NamespacedName)
l.Debugf("starting reconcile")
defer l.Debugf("reconcile finished")
eps := new(discoveryv1.EndpointSlice)
err = er.Get(ctx, req.NamespacedName, eps)
if apierrors.IsNotFound(err) {
l.Debugf("EndpointSlice not found")
return reconcile.Result{}, nil
}
if err != nil {
return reconcile.Result{}, fmt.Errorf("failed to get EndpointSlice: %w", err)
}
if !eps.DeletionTimestamp.IsZero() {
l.Debugf("EnpointSlice is being deleted")
return res, nil
}
// Get the user-created ExternalName Service and use its status conditions to determine whether cluster
// resources are set up for this tailnet service.
svc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: eps.Labels[LabelParentName],
Namespace: eps.Labels[LabelParentNamespace],
},
}
err = er.Get(ctx, client.ObjectKeyFromObject(svc), svc)
if apierrors.IsNotFound(err) {
l.Infof("ExternalName Service %s/%s not found, perhaps it was deleted", svc.Namespace, svc.Name)
return res, nil
}
if err != nil {
return res, fmt.Errorf("error retrieving ExternalName Service: %w", err)
}
if !tsoperator.EgressServiceIsValidAndConfigured(svc) {
l.Infof("Cluster resources for ExternalName Service %s/%s are not yet configured", svc.Namespace, svc.Name)
return res, nil
}
// TODO(irbekrm): currently this reconcile loop runs all the checks every time it's triggered, which is
// wasteful. Once we have a Ready condition for ExternalName Services for ProxyGroup, use the condition to
// determine if a reconcile is needed.
oldEps := eps.DeepCopy()
proxyGroupName := eps.Labels[labelProxyGroup]
tailnetSvc := tailnetSvcName(svc)
l = l.With("tailnet-service-name", tailnetSvc)
// Retrieve the desired tailnet service configuration from the ConfigMap.
_, cfgs, err := egressSvcsConfigs(ctx, er.Client, proxyGroupName, er.tsNamespace)
if err != nil {
return res, fmt.Errorf("error retrieving tailnet services configuration: %w", err)
}
cfg, ok := (*cfgs)[tailnetSvc]
if !ok {
l.Infof("[unexpected] configuration for tailnet service %s not found", tailnetSvc)
return res, nil
}
// Check which Pods in ProxyGroup are ready to route traffic to this
// egress service.
podList := &corev1.PodList{}
if err := er.List(ctx, podList, client.MatchingLabels(pgLabels(proxyGroupName, nil))); err != nil {
return res, fmt.Errorf("error listing Pods for ProxyGroup %s: %w", proxyGroupName, err)
}
newEndpoints := make([]discoveryv1.Endpoint, 0)
for _, pod := range podList.Items {
ready, err := er.podIsReadyToRouteTraffic(ctx, pod, &cfg, tailnetSvc, l)
if err != nil {
return res, fmt.Errorf("error verifying if Pod is ready to route traffic: %w", err)
}
if !ready {
continue // maybe next time
}
podIP, err := podIPv4(&pod) // we currently only support IPv4
if err != nil {
return res, fmt.Errorf("error determining IPv4 address for Pod: %w", err)
}
newEndpoints = append(newEndpoints, discoveryv1.Endpoint{
Hostname: (*string)(&pod.UID),
Addresses: []string{podIP},
Conditions: discoveryv1.EndpointConditions{
Ready: ptr.To(true),
Serving: ptr.To(true),
Terminating: ptr.To(false),
},
})
}
// Note that Endpoints are being overwritten with the currently valid endpoints so we don't need to explicitly
// run a cleanup for deleted Pods etc.
eps.Endpoints = newEndpoints
if !reflect.DeepEqual(eps, oldEps) {
l.Infof("Updating EndpointSlice to ensure traffic is routed to ready proxy Pods")
if err := er.Update(ctx, eps); err != nil {
return res, fmt.Errorf("error updating EndpointSlice: %w", err)
}
}
return res, nil
}
func podIPv4(pod *corev1.Pod) (string, error) {
for _, ip := range pod.Status.PodIPs {
parsed, err := netip.ParseAddr(ip.IP)
if err != nil {
return "", fmt.Errorf("error parsing IP address %s: %w", ip, err)
}
if parsed.Is4() {
return parsed.String(), nil
}
}
return "", nil
}
// podIsReadyToRouteTraffic returns true if it appears that the proxy Pod has configured firewall rules to be able to
// route traffic to the given tailnet service. It retrieves the proxy's state Secret and compares the tailnet service
// status written there to the desired service configuration.
func (er *egressEpsReconciler) podIsReadyToRouteTraffic(ctx context.Context, pod corev1.Pod, cfg *egressservices.Config, tailnetSvcName string, l *zap.SugaredLogger) (bool, error) {
l = l.With("proxy_pod", pod.Name)
l.Debugf("checking whether proxy is ready to route to egress service")
if !pod.DeletionTimestamp.IsZero() {
l.Debugf("proxy Pod is being deleted, ignore")
return false, nil
}
podIP, err := podIPv4(&pod)
if err != nil {
return false, fmt.Errorf("error determining Pod IP address: %v", err)
}
if podIP == "" {
l.Infof("[unexpected] Pod does not have an IPv4 address, and IPv6 is not currently supported")
return false, nil
}
stateS := &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Name: pod.Name,
Namespace: pod.Namespace,
},
}
err = er.Get(ctx, client.ObjectKeyFromObject(stateS), stateS)
if apierrors.IsNotFound(err) {
l.Debugf("proxy does not have a state Secret, waiting...")
return false, nil
}
if err != nil {
return false, fmt.Errorf("error getting state Secret: %w", err)
}
svcStatusBS := stateS.Data[egressservices.KeyEgressServices]
if len(svcStatusBS) == 0 {
l.Debugf("proxy's state Secret does not contain egress services status, waiting...")
return false, nil
}
svcStatus := &egressservices.Status{}
if err := json.Unmarshal(svcStatusBS, svcStatus); err != nil {
return false, fmt.Errorf("error unmarshalling egress service status: %w", err)
}
if !strings.EqualFold(podIP, svcStatus.PodIPv4) {
l.Infof("proxy's egress service status is for Pod IP %s, current proxy's Pod IP %s, waiting for the proxy to reconfigure...", svcStatus.PodIPv4, podIP)
return false, nil
}
st, ok := (*svcStatus).Services[tailnetSvcName]
if !ok {
l.Infof("proxy's state Secret does not have egress service status, waiting...")
return false, nil
}
if !reflect.DeepEqual(cfg.TailnetTarget, st.TailnetTarget) {
l.Infof("proxy has configured egress service for tailnet target %v, current target is %v, waiting for proxy to reconfigure...", st.TailnetTarget, cfg.TailnetTarget)
return false, nil
}
if !reflect.DeepEqual(cfg.Ports, st.Ports) {
l.Debugf("proxy has configured egress service for ports %#+v, wants ports %#+v, waiting for proxy to reconfigure", st.Ports, cfg.Ports)
return false, nil
}
l.Debugf("proxy is ready to route traffic to egress service")
return true, nil
}

View File

@@ -0,0 +1,211 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"encoding/json"
"fmt"
"math/rand/v2"
"testing"
"github.com/AlekSi/pointer"
"go.uber.org/zap"
corev1 "k8s.io/api/core/v1"
discoveryv1 "k8s.io/api/discovery/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/egressservices"
"tailscale.com/tstest"
"tailscale.com/util/mak"
)
func TestTailscaleEgressEndpointSlices(t *testing.T) {
clock := tstest.NewClock(tstest.ClockOpts{})
svc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
Namespace: "default",
UID: types.UID("1234-UID"),
Annotations: map[string]string{
AnnotationTailnetTargetFQDN: "foo.bar.ts.net",
AnnotationProxyGroup: "foo",
},
},
Spec: corev1.ServiceSpec{
ExternalName: "placeholder",
Type: corev1.ServiceTypeExternalName,
Selector: nil,
Ports: []corev1.ServicePort{
{
Name: "http",
Protocol: "TCP",
Port: 80,
},
},
},
Status: corev1.ServiceStatus{
Conditions: []metav1.Condition{
condition(tsapi.EgressSvcConfigured, metav1.ConditionTrue, "", "", clock),
condition(tsapi.EgressSvcValid, metav1.ConditionTrue, "", "", clock),
},
},
}
port := randomPort()
cm := configMapForSvc(t, svc, port)
fc := fake.NewClientBuilder().
WithScheme(tsapi.GlobalScheme).
WithObjects(svc, cm).
WithStatusSubresource(svc).
Build()
zl, err := zap.NewDevelopment()
if err != nil {
t.Fatal(err)
}
er := &egressEpsReconciler{
Client: fc,
logger: zl.Sugar(),
tsNamespace: "operator-ns",
}
eps := &discoveryv1.EndpointSlice{
ObjectMeta: metav1.ObjectMeta{
Name: "foo",
Namespace: "operator-ns",
Labels: map[string]string{
LabelParentName: "test",
LabelParentNamespace: "default",
labelSvcType: typeEgress,
labelProxyGroup: "foo"},
},
AddressType: discoveryv1.AddressTypeIPv4,
}
mustCreate(t, fc, eps)
t.Run("no_proxy_group_resources", func(t *testing.T) {
expectReconciled(t, er, "operator-ns", "foo") // should not error
})
t.Run("no_pods_ready_to_route_traffic", func(t *testing.T) {
pod, stateS := podAndSecretForProxyGroup("foo")
mustCreate(t, fc, pod)
mustCreate(t, fc, stateS)
expectReconciled(t, er, "operator-ns", "foo") // should not error
})
t.Run("pods_are_ready_to_route_traffic", func(t *testing.T) {
pod, stateS := podAndSecretForProxyGroup("foo")
stBs := serviceStatusForPodIP(t, svc, pod.Status.PodIPs[0].IP, port)
mustUpdate(t, fc, "operator-ns", stateS.Name, func(s *corev1.Secret) {
mak.Set(&s.Data, egressservices.KeyEgressServices, stBs)
})
expectReconciled(t, er, "operator-ns", "foo")
eps.Endpoints = append(eps.Endpoints, discoveryv1.Endpoint{
Addresses: []string{"10.0.0.1"},
Hostname: pointer.To("foo"),
Conditions: discoveryv1.EndpointConditions{
Serving: pointer.ToBool(true),
Ready: pointer.ToBool(true),
Terminating: pointer.ToBool(false),
},
})
expectEqual(t, fc, eps, nil)
})
t.Run("status_does_not_match_pod_ip", func(t *testing.T) {
_, stateS := podAndSecretForProxyGroup("foo") // replica Pod has IP 10.0.0.1
stBs := serviceStatusForPodIP(t, svc, "10.0.0.2", port) // status is for a Pod with IP 10.0.0.2
mustUpdate(t, fc, "operator-ns", stateS.Name, func(s *corev1.Secret) {
mak.Set(&s.Data, egressservices.KeyEgressServices, stBs)
})
expectReconciled(t, er, "operator-ns", "foo")
eps.Endpoints = []discoveryv1.Endpoint{}
expectEqual(t, fc, eps, nil)
})
}
func configMapForSvc(t *testing.T, svc *corev1.Service, p uint16) *corev1.ConfigMap {
t.Helper()
ports := make(map[egressservices.PortMap]struct{})
for _, port := range svc.Spec.Ports {
ports[egressservices.PortMap{Protocol: string(port.Protocol), MatchPort: p, TargetPort: uint16(port.Port)}] = struct{}{}
}
cfg := egressservices.Config{
Ports: ports,
}
if fqdn := svc.Annotations[AnnotationTailnetTargetFQDN]; fqdn != "" {
cfg.TailnetTarget = egressservices.TailnetTarget{FQDN: fqdn}
}
if ip := svc.Annotations[AnnotationTailnetTargetIP]; ip != "" {
cfg.TailnetTarget = egressservices.TailnetTarget{IP: ip}
}
name := tailnetSvcName(svc)
cfgs := egressservices.Configs{name: cfg}
bs, err := json.Marshal(&cfgs)
if err != nil {
t.Fatalf("error marshalling config: %v", err)
}
cm := &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: pgEgressCMName(svc.Annotations[AnnotationProxyGroup]),
Namespace: "operator-ns",
},
BinaryData: map[string][]byte{egressservices.KeyEgressServices: bs},
}
return cm
}
func serviceStatusForPodIP(t *testing.T, svc *corev1.Service, ip string, p uint16) []byte {
t.Helper()
ports := make(map[egressservices.PortMap]struct{})
for _, port := range svc.Spec.Ports {
ports[egressservices.PortMap{Protocol: string(port.Protocol), MatchPort: p, TargetPort: uint16(port.Port)}] = struct{}{}
}
svcSt := egressservices.ServiceStatus{Ports: ports}
if fqdn := svc.Annotations[AnnotationTailnetTargetFQDN]; fqdn != "" {
svcSt.TailnetTarget = egressservices.TailnetTarget{FQDN: fqdn}
}
if ip := svc.Annotations[AnnotationTailnetTargetIP]; ip != "" {
svcSt.TailnetTarget = egressservices.TailnetTarget{IP: ip}
}
svcName := tailnetSvcName(svc)
st := egressservices.Status{
PodIPv4: ip,
Services: map[string]*egressservices.ServiceStatus{svcName: &svcSt},
}
bs, err := json.Marshal(st)
if err != nil {
t.Fatalf("error marshalling service status: %v", err)
}
return bs
}
func podAndSecretForProxyGroup(pg string) (*corev1.Pod, *corev1.Secret) {
p := &corev1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-0", pg),
Namespace: "operator-ns",
Labels: pgLabels(pg, nil),
UID: "foo",
},
Status: corev1.PodStatus{
PodIPs: []corev1.PodIP{
{IP: "10.0.0.1"},
},
},
}
s := &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-0", pg),
Namespace: "operator-ns",
Labels: pgSecretLabels(pg, "state"),
},
}
return p, s
}
func randomPort() uint16 {
return uint16(rand.Int32N(1000) + 1000)
}

View File

@@ -0,0 +1,179 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"context"
"errors"
"fmt"
"strings"
"go.uber.org/zap"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
discoveryv1 "k8s.io/api/discovery/v1"
apiequality "k8s.io/apimachinery/pkg/api/equality"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/tstime"
)
const (
reasonReadinessCheckFailed = "ReadinessCheckFailed"
reasonClusterResourcesNotReady = "ClusterResourcesNotReady"
reasonNoProxies = "NoProxiesConfigured"
reasonNotReady = "NotReadyToRouteTraffic"
reasonReady = "ReadyToRouteTraffic"
reasonPartiallyReady = "PartiallyReadyToRouteTraffic"
msgReadyToRouteTemplate = "%d out of %d replicas are ready to route traffic"
)
type egressSvcsReadinessReconciler struct {
client.Client
logger *zap.SugaredLogger
clock tstime.Clock
tsNamespace string
}
// Reconcile reconciles an ExternalName Service that defines a tailnet target to be exposed on a ProxyGroup and sets the
// EgressSvcReady condition on it. The condition gets set to true if at least one of the proxies is currently ready to
// route traffic to the target. It compares proxy Pod IPs with the endpoints set on the EndpointSlice for the egress
// service to determine how many replicas are currently able to route traffic.
func (esrr *egressSvcsReadinessReconciler) Reconcile(ctx context.Context, req reconcile.Request) (res reconcile.Result, err error) {
l := esrr.logger.With("Service", req.NamespacedName)
defer l.Info("reconcile finished")
svc := new(corev1.Service)
if err = esrr.Get(ctx, req.NamespacedName, svc); apierrors.IsNotFound(err) {
l.Info("Service not found")
return res, nil
} else if err != nil {
return res, fmt.Errorf("failed to get Service: %w", err)
}
var (
reason, msg string
st metav1.ConditionStatus = metav1.ConditionUnknown
)
oldStatus := svc.Status.DeepCopy()
defer func() {
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcReady, st, reason, msg, esrr.clock, l)
if !apiequality.Semantic.DeepEqual(oldStatus, svc.Status) {
err = errors.Join(err, esrr.Status().Update(ctx, svc))
}
}()
crl := egressSvcChildResourceLabels(svc)
eps, err := getSingleObject[discoveryv1.EndpointSlice](ctx, esrr.Client, esrr.tsNamespace, crl)
if err != nil {
err = fmt.Errorf("error getting EndpointSlice: %w", err)
reason = reasonReadinessCheckFailed
msg = err.Error()
return res, err
}
if eps == nil {
l.Infof("EndpointSlice for Service does not yet exist, waiting...")
reason, msg = reasonClusterResourcesNotReady, reasonClusterResourcesNotReady
st = metav1.ConditionFalse
return res, nil
}
pg := &tsapi.ProxyGroup{
ObjectMeta: metav1.ObjectMeta{
Name: svc.Annotations[AnnotationProxyGroup],
},
}
err = esrr.Get(ctx, client.ObjectKeyFromObject(pg), pg)
if apierrors.IsNotFound(err) {
l.Infof("ProxyGroup for Service does not exist, waiting...")
reason, msg = reasonClusterResourcesNotReady, reasonClusterResourcesNotReady
st = metav1.ConditionFalse
return res, nil
}
if err != nil {
err = fmt.Errorf("error retrieving ProxyGroup: %w", err)
reason = reasonReadinessCheckFailed
msg = err.Error()
return res, err
}
if !tsoperator.ProxyGroupIsReady(pg) {
l.Infof("ProxyGroup for Service is not ready, waiting...")
reason, msg = reasonClusterResourcesNotReady, reasonClusterResourcesNotReady
st = metav1.ConditionFalse
return res, nil
}
replicas := pgReplicas(pg)
if replicas == 0 {
l.Infof("ProxyGroup replicas set to 0")
reason, msg = reasonNoProxies, reasonNoProxies
st = metav1.ConditionFalse
return res, nil
}
podLabels := pgLabels(pg.Name, nil)
var readyReplicas int32
for i := range replicas {
podLabels[appsv1.PodIndexLabel] = fmt.Sprintf("%d", i)
pod, err := getSingleObject[corev1.Pod](ctx, esrr.Client, esrr.tsNamespace, podLabels)
if err != nil {
err = fmt.Errorf("error retrieving ProxyGroup Pod: %w", err)
reason = reasonReadinessCheckFailed
msg = err.Error()
return res, err
}
if pod == nil {
l.Infof("[unexpected] ProxyGroup is ready, but replica %d was not found", i)
reason, msg = reasonClusterResourcesNotReady, reasonClusterResourcesNotReady
return res, nil
}
l.Infof("looking at Pod with IPs %v", pod.Status.PodIPs)
ready := false
for _, ep := range eps.Endpoints {
l.Infof("looking at endpoint with addresses %v", ep.Addresses)
if endpointReadyForPod(&ep, pod, l) {
l.Infof("endpoint is ready for Pod")
ready = true
break
}
}
if ready {
readyReplicas++
}
}
msg = fmt.Sprintf(msgReadyToRouteTemplate, readyReplicas, replicas)
if readyReplicas == 0 {
reason = reasonNotReady
st = metav1.ConditionFalse
return res, nil
}
st = metav1.ConditionTrue
if readyReplicas < replicas {
reason = reasonPartiallyReady
} else {
reason = reasonReady
}
return res, nil
}
// endpointReadyForPod returns true if the endpoint is for the Pod's IPv4 address and is ready to serve traffic.
// Endpoint must not be nil.
func endpointReadyForPod(ep *discoveryv1.Endpoint, pod *corev1.Pod, l *zap.SugaredLogger) bool {
podIP, err := podIPv4(pod)
if err != nil {
l.Infof("[unexpected] error retrieving Pod's IPv4 address: %v", err)
return false
}
// Currently we only ever set a single address on and Endpoint and nothing else is meant to modify this.
if len(ep.Addresses) != 1 {
return false
}
return strings.EqualFold(ep.Addresses[0], podIP) &&
*ep.Conditions.Ready &&
*ep.Conditions.Serving &&
!*ep.Conditions.Terminating
}

View File

@@ -0,0 +1,169 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"fmt"
"testing"
"github.com/AlekSi/pointer"
"go.uber.org/zap"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
discoveryv1 "k8s.io/api/discovery/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/tstest"
"tailscale.com/tstime"
)
func TestEgressServiceReadiness(t *testing.T) {
// We need to pass a ProxyGroup object to WithStatusSubresource because of some quirks in how the fake client
// works. Without this code further down would not be able to update ProxyGroup status.
fc := fake.NewClientBuilder().
WithScheme(tsapi.GlobalScheme).
WithStatusSubresource(&tsapi.ProxyGroup{}).
Build()
zl, _ := zap.NewDevelopment()
cl := tstest.NewClock(tstest.ClockOpts{})
rec := &egressSvcsReadinessReconciler{
tsNamespace: "operator-ns",
Client: fc,
logger: zl.Sugar(),
clock: cl,
}
tailnetFQDN := "my-app.tailnetxyz.ts.net"
egressSvc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: "my-app",
Namespace: "dev",
Annotations: map[string]string{
AnnotationProxyGroup: "dev",
AnnotationTailnetTargetFQDN: tailnetFQDN,
},
},
}
fakeClusterIPSvc := &corev1.Service{ObjectMeta: metav1.ObjectMeta{Name: "my-app", Namespace: "operator-ns"}}
l := egressSvcEpsLabels(egressSvc, fakeClusterIPSvc)
eps := &discoveryv1.EndpointSlice{
ObjectMeta: metav1.ObjectMeta{
Name: "my-app",
Namespace: "operator-ns",
Labels: l,
},
AddressType: discoveryv1.AddressTypeIPv4,
}
pg := &tsapi.ProxyGroup{
ObjectMeta: metav1.ObjectMeta{
Name: "dev",
},
}
mustCreate(t, fc, egressSvc)
setClusterNotReady(egressSvc, cl, zl.Sugar())
t.Run("endpointslice_does_not_exist", func(t *testing.T) {
expectReconciled(t, rec, "dev", "my-app")
expectEqual(t, fc, egressSvc, nil) // not ready
})
t.Run("proxy_group_does_not_exist", func(t *testing.T) {
mustCreate(t, fc, eps)
expectReconciled(t, rec, "dev", "my-app")
expectEqual(t, fc, egressSvc, nil) // still not ready
})
t.Run("proxy_group_not_ready", func(t *testing.T) {
mustCreate(t, fc, pg)
expectReconciled(t, rec, "dev", "my-app")
expectEqual(t, fc, egressSvc, nil) // still not ready
})
t.Run("no_ready_replicas", func(t *testing.T) {
setPGReady(pg, cl, zl.Sugar())
mustUpdateStatus(t, fc, pg.Namespace, pg.Name, func(p *tsapi.ProxyGroup) {
p.Status = pg.Status
})
expectEqual(t, fc, pg, nil)
for i := range pgReplicas(pg) {
p := pod(pg, i)
mustCreate(t, fc, p)
mustUpdateStatus(t, fc, p.Namespace, p.Name, func(existing *corev1.Pod) {
existing.Status.PodIPs = p.Status.PodIPs
})
}
expectReconciled(t, rec, "dev", "my-app")
setNotReady(egressSvc, cl, zl.Sugar(), pgReplicas(pg))
expectEqual(t, fc, egressSvc, nil) // still not ready
})
t.Run("one_ready_replica", func(t *testing.T) {
setEndpointForReplica(pg, 0, eps)
mustUpdate(t, fc, eps.Namespace, eps.Name, func(e *discoveryv1.EndpointSlice) {
e.Endpoints = eps.Endpoints
})
setReady(egressSvc, cl, zl.Sugar(), pgReplicas(pg), 1)
expectReconciled(t, rec, "dev", "my-app")
expectEqual(t, fc, egressSvc, nil) // partially ready
})
t.Run("all_replicas_ready", func(t *testing.T) {
for i := range pgReplicas(pg) {
setEndpointForReplica(pg, i, eps)
}
mustUpdate(t, fc, eps.Namespace, eps.Name, func(e *discoveryv1.EndpointSlice) {
e.Endpoints = eps.Endpoints
})
setReady(egressSvc, cl, zl.Sugar(), pgReplicas(pg), pgReplicas(pg))
expectReconciled(t, rec, "dev", "my-app")
expectEqual(t, fc, egressSvc, nil) // ready
})
}
func setClusterNotReady(svc *corev1.Service, cl tstime.Clock, l *zap.SugaredLogger) {
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcReady, metav1.ConditionFalse, reasonClusterResourcesNotReady, reasonClusterResourcesNotReady, cl, l)
}
func setNotReady(svc *corev1.Service, cl tstime.Clock, l *zap.SugaredLogger, replicas int32) {
msg := fmt.Sprintf(msgReadyToRouteTemplate, 0, replicas)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcReady, metav1.ConditionFalse, reasonNotReady, msg, cl, l)
}
func setReady(svc *corev1.Service, cl tstime.Clock, l *zap.SugaredLogger, replicas, readyReplicas int32) {
reason := reasonPartiallyReady
if readyReplicas == replicas {
reason = reasonReady
}
msg := fmt.Sprintf(msgReadyToRouteTemplate, readyReplicas, replicas)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcReady, metav1.ConditionTrue, reason, msg, cl, l)
}
func setPGReady(pg *tsapi.ProxyGroup, cl tstime.Clock, l *zap.SugaredLogger) {
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionTrue, "foo", "foo", pg.Generation, cl, l)
}
func setEndpointForReplica(pg *tsapi.ProxyGroup, ordinal int32, eps *discoveryv1.EndpointSlice) {
p := pod(pg, ordinal)
eps.Endpoints = append(eps.Endpoints, discoveryv1.Endpoint{
Addresses: []string{p.Status.PodIPs[0].IP},
Conditions: discoveryv1.EndpointConditions{
Ready: pointer.ToBool(true),
Serving: pointer.ToBool(true),
Terminating: pointer.ToBool(false),
},
})
}
func pod(pg *tsapi.ProxyGroup, ordinal int32) *corev1.Pod {
l := pgLabels(pg.Name, nil)
l[appsv1.PodIndexLabel] = fmt.Sprintf("%d", ordinal)
ip := fmt.Sprintf("10.0.0.%d", ordinal)
return &corev1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-%d", pg.Name, ordinal),
Namespace: "operator-ns",
Labels: l,
},
Status: corev1.PodStatus{
PodIPs: []corev1.PodIP{{IP: ip}},
},
}
}

View File

@@ -0,0 +1,716 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"context"
"crypto/sha256"
"encoding/json"
"errors"
"fmt"
"math/rand/v2"
"reflect"
"slices"
"strings"
"sync"
"go.uber.org/zap"
corev1 "k8s.io/api/core/v1"
discoveryv1 "k8s.io/api/discovery/v1"
apiequality "k8s.io/apimachinery/pkg/api/equality"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/intstr"
"k8s.io/apimachinery/pkg/util/sets"
"k8s.io/apiserver/pkg/storage/names"
"k8s.io/client-go/tools/record"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/egressservices"
"tailscale.com/kube/kubetypes"
"tailscale.com/tstime"
"tailscale.com/util/clientmetric"
"tailscale.com/util/mak"
"tailscale.com/util/set"
)
const (
reasonEgressSvcInvalid = "EgressSvcInvalid"
reasonEgressSvcValid = "EgressSvcValid"
reasonEgressSvcCreationFailed = "EgressSvcCreationFailed"
reasonProxyGroupNotReady = "ProxyGroupNotReady"
labelProxyGroup = "tailscale.com/proxy-group"
labelSvcType = "tailscale.com/svc-type" // ingress or egress
typeEgress = "egress"
// maxPorts is the maximum number of ports that can be exposed on a
// container. In practice this will be ports in range [3000 - 4000). The
// high range should make it easier to distinguish container ports from
// the tailnet target ports for debugging purposes (i.e when reading
// netfilter rules). The limit of 10000 is somewhat arbitrary, the
// assumption is that this would not be hit in practice.
maxPorts = 10000
indexEgressProxyGroup = ".metadata.annotations.egress-proxy-group"
)
var gaugeEgressServices = clientmetric.NewGauge(kubetypes.MetricEgressServiceCount)
// egressSvcsReconciler reconciles user created ExternalName Services that specify a tailnet
// endpoint that should be exposed to cluster workloads and an egress ProxyGroup
// on whose proxies it should be exposed.
type egressSvcsReconciler struct {
client.Client
logger *zap.SugaredLogger
recorder record.EventRecorder
clock tstime.Clock
tsNamespace string
mu sync.Mutex // protects following
svcs set.Slice[types.UID] // UIDs of all currently managed egress Services for ProxyGroup
}
// Reconcile reconciles an ExternalName Service that specifies a tailnet target and a ProxyGroup on whose proxies should
// forward cluster traffic to the target.
// For an ExternalName Service the reconciler:
//
// - for each port N defined on the ExternalName Service, allocates a port X in range [3000- 4000), unique for the
// ProxyGroup proxies. Proxies will forward cluster traffic received on port N to port M on the tailnet target
//
// - creates a ClusterIP Service in the operator's namespace with portmappings for all M->N port pairs. This will allow
// cluster workloads to send traffic on the user-defined tailnet target port and get it transparently mapped to the
// randomly selected port on proxy Pods.
//
// - creates an EndpointSlice in the operator's namespace with kubernetes.io/service-name label pointing to the
// ClusterIP Service. The endpoints will get dynamically updates to proxy Pod IPs as the Pods become ready to route
// traffic to the tailnet target. kubernetes.io/service-name label ensures that kube-proxy sets up routing rules to
// forward cluster traffic received on ClusterIP Service's IP address to the endpoints (Pod IPs).
//
// - updates the egress service config in a ConfigMap mounted to the ProxyGroup proxies with the tailnet target and the
// portmappings.
func (esr *egressSvcsReconciler) Reconcile(ctx context.Context, req reconcile.Request) (res reconcile.Result, err error) {
l := esr.logger.With("Service", req.NamespacedName)
defer l.Info("reconcile finished")
svc := new(corev1.Service)
if err = esr.Get(ctx, req.NamespacedName, svc); apierrors.IsNotFound(err) {
l.Info("Service not found")
return res, nil
} else if err != nil {
return res, fmt.Errorf("failed to get Service: %w", err)
}
// Name of the 'egress service', meaning the tailnet target.
tailnetSvc := tailnetSvcName(svc)
l = l.With("tailnet-service", tailnetSvc)
// Note that resources for egress Services are only cleaned up when the
// Service is actually deleted (and not if, for example, user decides to
// remove the Tailscale annotation from it). This should be fine- we
// assume that the egress ExternalName Services are always created for
// Tailscale operator specifically.
if !svc.DeletionTimestamp.IsZero() {
l.Info("Service is being deleted, ensuring resource cleanup")
return res, esr.maybeCleanup(ctx, svc, l)
}
oldStatus := svc.Status.DeepCopy()
defer func() {
if !apiequality.Semantic.DeepEqual(oldStatus, svc.Status) {
err = errors.Join(err, esr.Status().Update(ctx, svc))
}
}()
// Validate the user-created ExternalName Service and the associated ProxyGroup.
if ok, err := esr.validateClusterResources(ctx, svc, l); err != nil {
return res, fmt.Errorf("error validating cluster resources: %w", err)
} else if !ok {
return res, nil
}
if !slices.Contains(svc.Finalizers, FinalizerName) {
l.Infof("configuring tailnet service") // logged exactly once
svc.Finalizers = append(svc.Finalizers, FinalizerName)
if err := esr.Update(ctx, svc); err != nil {
err := fmt.Errorf("failed to add finalizer: %w", err)
r := svcConfiguredReason(svc, false, l)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcConfigured, metav1.ConditionFalse, r, err.Error(), esr.clock, l)
return res, err
}
esr.mu.Lock()
esr.svcs.Add(svc.UID)
gaugeEgressServices.Set(int64(esr.svcs.Len()))
esr.mu.Unlock()
}
if err := esr.maybeCleanupProxyGroupConfig(ctx, svc, l); err != nil {
err = fmt.Errorf("cleaning up resources for previous ProxyGroup failed: %w", err)
r := svcConfiguredReason(svc, false, l)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcConfigured, metav1.ConditionFalse, r, err.Error(), esr.clock, l)
return res, err
}
return res, esr.maybeProvision(ctx, svc, l)
}
func (esr *egressSvcsReconciler) maybeProvision(ctx context.Context, svc *corev1.Service, l *zap.SugaredLogger) (err error) {
r := svcConfiguredReason(svc, false, l)
st := metav1.ConditionFalse
defer func() {
msg := r
if st != metav1.ConditionTrue && err != nil {
msg = err.Error()
}
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcConfigured, st, r, msg, esr.clock, l)
}()
crl := egressSvcChildResourceLabels(svc)
clusterIPSvc, err := getSingleObject[corev1.Service](ctx, esr.Client, esr.tsNamespace, crl)
if err != nil {
err = fmt.Errorf("error retrieving ClusterIP Service: %w", err)
return err
}
if clusterIPSvc == nil {
clusterIPSvc = esr.clusterIPSvcForEgress(crl)
}
upToDate := svcConfigurationUpToDate(svc, l)
provisioned := true
if !upToDate {
if clusterIPSvc, provisioned, err = esr.provision(ctx, svc.Annotations[AnnotationProxyGroup], svc, clusterIPSvc, l); err != nil {
return err
}
}
if !provisioned {
l.Infof("unable to provision cluster resources")
return nil
}
// Update ExternalName Service to point at the ClusterIP Service.
clusterDomain := retrieveClusterDomain(esr.tsNamespace, l)
clusterIPSvcFQDN := fmt.Sprintf("%s.%s.svc.%s", clusterIPSvc.Name, clusterIPSvc.Namespace, clusterDomain)
if svc.Spec.ExternalName != clusterIPSvcFQDN {
l.Infof("Configuring ExternalName Service to point to ClusterIP Service %s", clusterIPSvcFQDN)
svc.Spec.ExternalName = clusterIPSvcFQDN
if err = esr.Update(ctx, svc); err != nil {
err = fmt.Errorf("error updating ExternalName Service: %w", err)
return err
}
}
r = svcConfiguredReason(svc, true, l)
st = metav1.ConditionTrue
return nil
}
func (esr *egressSvcsReconciler) provision(ctx context.Context, proxyGroupName string, svc, clusterIPSvc *corev1.Service, l *zap.SugaredLogger) (*corev1.Service, bool, error) {
l.Infof("updating configuration...")
usedPorts, err := esr.usedPortsForPG(ctx, proxyGroupName)
if err != nil {
return nil, false, fmt.Errorf("error calculating used ports for ProxyGroup %s: %w", proxyGroupName, err)
}
oldClusterIPSvc := clusterIPSvc.DeepCopy()
// loop over ClusterIP Service ports, remove any that are not needed.
for i := len(clusterIPSvc.Spec.Ports) - 1; i >= 0; i-- {
pm := clusterIPSvc.Spec.Ports[i]
found := false
for _, wantsPM := range svc.Spec.Ports {
if wantsPM.Port == pm.Port && strings.EqualFold(string(wantsPM.Protocol), string(pm.Protocol)) {
found = true
break
}
}
if !found {
l.Debugf("portmapping %s:%d -> %s:%d is no longer required, removing", pm.Protocol, pm.TargetPort.IntVal, pm.Protocol, pm.Port)
clusterIPSvc.Spec.Ports = slices.Delete(clusterIPSvc.Spec.Ports, i, i+1)
}
}
// loop over ExternalName Service ports, for each one not found on
// ClusterIP Service produce new target port and add a portmapping to
// the ClusterIP Service.
for _, wantsPM := range svc.Spec.Ports {
found := false
for _, gotPM := range clusterIPSvc.Spec.Ports {
if wantsPM.Port == gotPM.Port && strings.EqualFold(string(wantsPM.Protocol), string(gotPM.Protocol)) {
found = true
break
}
}
if !found {
// Calculate a free port to expose on container and add
// a new PortMap to the ClusterIP Service.
if usedPorts.Len() == maxPorts {
// TODO(irbekrm): refactor to avoid extra reconciles here. Low priority as in practice,
// the limit should not be hit.
return nil, false, fmt.Errorf("unable to allocate additional ports on ProxyGroup %s, %d ports already used. Create another ProxyGroup or open an issue if you believe this is unexpected.", proxyGroupName, maxPorts)
}
p := unusedPort(usedPorts)
l.Debugf("mapping tailnet target port %d to container port %d", wantsPM.Port, p)
usedPorts.Insert(p)
clusterIPSvc.Spec.Ports = append(clusterIPSvc.Spec.Ports, corev1.ServicePort{
Name: wantsPM.Name,
Protocol: wantsPM.Protocol,
Port: wantsPM.Port,
TargetPort: intstr.FromInt32(p),
})
}
}
if !reflect.DeepEqual(clusterIPSvc, oldClusterIPSvc) {
if clusterIPSvc, err = createOrUpdate(ctx, esr.Client, esr.tsNamespace, clusterIPSvc, func(svc *corev1.Service) {
svc.Labels = clusterIPSvc.Labels
svc.Spec = clusterIPSvc.Spec
}); err != nil {
return nil, false, fmt.Errorf("error ensuring ClusterIP Service: %v", err)
}
}
crl := egressSvcEpsLabels(svc, clusterIPSvc)
// TODO(irbekrm): support IPv6, but need to investigate how kube proxy
// sets up Service -> Pod routing when IPv6 is involved.
eps := &discoveryv1.EndpointSlice{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-ipv4", clusterIPSvc.Name),
Namespace: esr.tsNamespace,
Labels: crl,
},
AddressType: discoveryv1.AddressTypeIPv4,
Ports: epsPortsFromSvc(clusterIPSvc),
}
if eps, err = createOrUpdate(ctx, esr.Client, esr.tsNamespace, eps, func(e *discoveryv1.EndpointSlice) {
e.Labels = eps.Labels
e.AddressType = eps.AddressType
e.Ports = eps.Ports
for _, p := range e.Endpoints {
p.Conditions.Ready = nil
}
}); err != nil {
return nil, false, fmt.Errorf("error ensuring EndpointSlice: %w", err)
}
cm, cfgs, err := egressSvcsConfigs(ctx, esr.Client, proxyGroupName, esr.tsNamespace)
if err != nil {
return nil, false, fmt.Errorf("error retrieving egress services configuration: %w", err)
}
if cm == nil {
l.Info("ConfigMap not yet created, waiting..")
return nil, false, nil
}
tailnetSvc := tailnetSvcName(svc)
gotCfg := (*cfgs)[tailnetSvc]
wantsCfg := egressSvcCfg(svc, clusterIPSvc)
if !reflect.DeepEqual(gotCfg, wantsCfg) {
l.Debugf("updating egress services ConfigMap %s", cm.Name)
mak.Set(cfgs, tailnetSvc, wantsCfg)
bs, err := json.Marshal(cfgs)
if err != nil {
return nil, false, fmt.Errorf("error marshalling egress services configs: %w", err)
}
mak.Set(&cm.BinaryData, egressservices.KeyEgressServices, bs)
if err := esr.Update(ctx, cm); err != nil {
return nil, false, fmt.Errorf("error updating egress services ConfigMap: %w", err)
}
}
l.Infof("egress service configuration has been updated")
return clusterIPSvc, true, nil
}
func (esr *egressSvcsReconciler) maybeCleanup(ctx context.Context, svc *corev1.Service, logger *zap.SugaredLogger) error {
logger.Info("ensuring that resources created for egress service are deleted")
// Delete egress service config from the ConfigMap mounted by the proxies.
if err := esr.ensureEgressSvcCfgDeleted(ctx, svc, logger); err != nil {
return fmt.Errorf("error deleting egress service config: %w", err)
}
// Delete the ClusterIP Service and EndpointSlice for the egress
// service.
types := []client.Object{
&corev1.Service{},
&discoveryv1.EndpointSlice{},
}
crl := egressSvcChildResourceLabels(svc)
for _, typ := range types {
if err := esr.DeleteAllOf(ctx, typ, client.InNamespace(esr.tsNamespace), client.MatchingLabels(crl)); err != nil {
return fmt.Errorf("error deleting %s: %w", typ, err)
}
}
ix := slices.Index(svc.Finalizers, FinalizerName)
if ix != -1 {
logger.Debug("Removing Tailscale finalizer from Service")
svc.Finalizers = append(svc.Finalizers[:ix], svc.Finalizers[ix+1:]...)
if err := esr.Update(ctx, svc); err != nil {
return fmt.Errorf("failed to remove finalizer: %w", err)
}
}
esr.mu.Lock()
esr.svcs.Remove(svc.UID)
gaugeEgressServices.Set(int64(esr.svcs.Len()))
esr.mu.Unlock()
logger.Info("successfully cleaned up resources for egress Service")
return nil
}
func (esr *egressSvcsReconciler) maybeCleanupProxyGroupConfig(ctx context.Context, svc *corev1.Service, l *zap.SugaredLogger) error {
wantsProxyGroup := svc.Annotations[AnnotationProxyGroup]
cond := tsoperator.GetServiceCondition(svc, tsapi.EgressSvcConfigured)
if cond == nil {
return nil
}
ss := strings.Split(cond.Reason, ":")
if len(ss) < 3 {
return nil
}
if strings.EqualFold(wantsProxyGroup, ss[2]) {
return nil
}
esr.logger.Infof("egress Service configured on ProxyGroup %s, wants ProxyGroup %s, cleaning up...", ss[2], wantsProxyGroup)
if err := esr.ensureEgressSvcCfgDeleted(ctx, svc, l); err != nil {
return fmt.Errorf("error deleting egress service config: %w", err)
}
return nil
}
// usedPortsForPG calculates the currently used match ports for ProxyGroup
// containers. It does that by looking by retrieving all target ports of all
// ClusterIP Services created for egress services exposed on this ProxyGroup's
// proxies.
// TODO(irbekrm): this is currently good enough because we only have a single worker and
// because these Services are created by us, so we can always expect to get the
// latest ClusterIP Services via the controller cache. It will not work as well
// once we split into multiple workers- at that point we probably want to set
// used ports on ProxyGroup's status.
func (esr *egressSvcsReconciler) usedPortsForPG(ctx context.Context, pg string) (sets.Set[int32], error) {
svcList := &corev1.ServiceList{}
if err := esr.List(ctx, svcList, client.InNamespace(esr.tsNamespace), client.MatchingLabels(map[string]string{labelProxyGroup: pg})); err != nil {
return nil, fmt.Errorf("error listing Services: %w", err)
}
usedPorts := sets.New[int32]()
for _, s := range svcList.Items {
for _, p := range s.Spec.Ports {
usedPorts.Insert(p.TargetPort.IntVal)
}
}
return usedPorts, nil
}
// clusterIPSvcForEgress returns a template for the ClusterIP Service created
// for an egress service exposed on ProxyGroup proxies. The ClusterIP Service
// has no selector. Traffic sent to it will be routed to the endpoints defined
// by an EndpointSlice created for this egress service.
func (esr *egressSvcsReconciler) clusterIPSvcForEgress(crl map[string]string) *corev1.Service {
return &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
GenerateName: svcNameBase(crl[LabelParentName]),
Namespace: esr.tsNamespace,
Labels: crl,
},
Spec: corev1.ServiceSpec{
Type: corev1.ServiceTypeClusterIP,
},
}
}
func (esr *egressSvcsReconciler) ensureEgressSvcCfgDeleted(ctx context.Context, svc *corev1.Service, logger *zap.SugaredLogger) error {
crl := egressSvcChildResourceLabels(svc)
cmName := pgEgressCMName(crl[labelProxyGroup])
cm := &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: cmName,
Namespace: esr.tsNamespace,
},
}
l := logger.With("ConfigMap", client.ObjectKeyFromObject(cm))
l.Debug("ensuring that egress service configuration is removed from proxy config")
if err := esr.Get(ctx, client.ObjectKeyFromObject(cm), cm); apierrors.IsNotFound(err) {
l.Debugf("ConfigMap not found")
return nil
} else if err != nil {
return fmt.Errorf("error retrieving ConfigMap: %w", err)
}
bs := cm.BinaryData[egressservices.KeyEgressServices]
if len(bs) == 0 {
l.Debugf("ConfigMap does not contain egress service configs")
return nil
}
cfgs := &egressservices.Configs{}
if err := json.Unmarshal(bs, cfgs); err != nil {
return fmt.Errorf("error unmarshalling egress services configs")
}
tailnetSvc := tailnetSvcName(svc)
_, ok := (*cfgs)[tailnetSvc]
if !ok {
l.Debugf("ConfigMap does not contain egress service config, likely because it was already deleted")
return nil
}
l.Infof("before deleting config %+#v", *cfgs)
delete(*cfgs, tailnetSvc)
l.Infof("after deleting config %+#v", *cfgs)
bs, err := json.Marshal(cfgs)
if err != nil {
return fmt.Errorf("error marshalling egress services configs: %w", err)
}
mak.Set(&cm.BinaryData, egressservices.KeyEgressServices, bs)
return esr.Update(ctx, cm)
}
func (esr *egressSvcsReconciler) validateClusterResources(ctx context.Context, svc *corev1.Service, l *zap.SugaredLogger) (bool, error) {
proxyGroupName := svc.Annotations[AnnotationProxyGroup]
pg := &tsapi.ProxyGroup{
ObjectMeta: metav1.ObjectMeta{
Name: proxyGroupName,
},
}
if err := esr.Get(ctx, client.ObjectKeyFromObject(pg), pg); apierrors.IsNotFound(err) {
l.Infof("ProxyGroup %q not found, waiting...", proxyGroupName)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcValid, metav1.ConditionUnknown, reasonProxyGroupNotReady, reasonProxyGroupNotReady, esr.clock, l)
tsoperator.RemoveServiceCondition(svc, tsapi.EgressSvcConfigured)
return false, nil
} else if err != nil {
err := fmt.Errorf("unable to retrieve ProxyGroup %s: %w", proxyGroupName, err)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcValid, metav1.ConditionUnknown, reasonProxyGroupNotReady, err.Error(), esr.clock, l)
tsoperator.RemoveServiceCondition(svc, tsapi.EgressSvcConfigured)
return false, err
}
if !tsoperator.ProxyGroupIsReady(pg) {
l.Infof("ProxyGroup %s is not ready, waiting...", proxyGroupName)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcValid, metav1.ConditionUnknown, reasonProxyGroupNotReady, reasonProxyGroupNotReady, esr.clock, l)
tsoperator.RemoveServiceCondition(svc, tsapi.EgressSvcConfigured)
return false, nil
}
if violations := validateEgressService(svc, pg); len(violations) > 0 {
msg := fmt.Sprintf("invalid egress Service: %s", strings.Join(violations, ", "))
esr.recorder.Event(svc, corev1.EventTypeWarning, "INVALIDSERVICE", msg)
l.Info(msg)
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcValid, metav1.ConditionFalse, reasonEgressSvcInvalid, msg, esr.clock, l)
tsoperator.RemoveServiceCondition(svc, tsapi.EgressSvcConfigured)
return false, nil
}
l.Debugf("egress service is valid")
tsoperator.SetServiceCondition(svc, tsapi.EgressSvcValid, metav1.ConditionTrue, reasonEgressSvcValid, reasonEgressSvcValid, esr.clock, l)
return true, nil
}
func validateEgressService(svc *corev1.Service, pg *tsapi.ProxyGroup) []string {
violations := validateService(svc)
// We check that only one of these two is set in the earlier validateService function.
if svc.Annotations[AnnotationTailnetTargetFQDN] == "" && svc.Annotations[AnnotationTailnetTargetIP] == "" {
violations = append(violations, fmt.Sprintf("egress Service for ProxyGroup must have one of %s, %s annotations set", AnnotationTailnetTargetFQDN, AnnotationTailnetTargetIP))
}
if len(svc.Spec.Ports) == 0 {
violations = append(violations, "egress Service for ProxyGroup must have at least one target Port specified")
}
if svc.Spec.Type != corev1.ServiceTypeExternalName {
violations = append(violations, fmt.Sprintf("unexpected egress Service type %s. The only supported type is ExternalName.", svc.Spec.Type))
}
if pg.Spec.Type != tsapi.ProxyGroupTypeEgress {
violations = append(violations, fmt.Sprintf("egress Service references ProxyGroup of type %s, must be type %s", pg.Spec.Type, tsapi.ProxyGroupTypeEgress))
}
return violations
}
// egressSvcNameBase returns a name base that can be passed to
// ObjectMeta.GenerateName to generate a name for the ClusterIP Service.
// The generated name needs to be short enough so that it can later be used to
// generate a valid Kubernetes resource name for the EndpointSlice in form
// 'ipv4-|ipv6-<ClusterIP Service name>.
// A valid Kubernetes resource name must not be longer than 253 chars.
func svcNameBase(s string) string {
// -ipv4 - ipv6
const maxClusterIPSvcNameLength = 253 - 5
base := fmt.Sprintf("ts-%s-", s)
generator := names.SimpleNameGenerator
for {
generatedName := generator.GenerateName(base)
excess := len(generatedName) - maxClusterIPSvcNameLength
if excess <= 0 {
return base
}
base = base[:len(base)-1-excess] // cut off the excess chars
base = base + "-" // re-instate the dash
}
}
// unusedPort returns a port in range [3000 - 4000). The caller must ensure that
// usedPorts does not contain all ports in range [3000 - 4000).
func unusedPort(usedPorts sets.Set[int32]) int32 {
foundFreePort := false
var suggestPort int32
for !foundFreePort {
suggestPort = rand.Int32N(maxPorts) + 3000
if !usedPorts.Has(suggestPort) {
foundFreePort = true
}
}
return suggestPort
}
// tailnetTargetFromSvc returns a tailnet target for the given egress Service.
// Service must contain exactly one of tailscale.com/tailnet-ip,
// tailscale.com/tailnet-fqdn annotations.
func tailnetTargetFromSvc(svc *corev1.Service) egressservices.TailnetTarget {
if fqdn := svc.Annotations[AnnotationTailnetTargetFQDN]; fqdn != "" {
return egressservices.TailnetTarget{
FQDN: fqdn,
}
}
return egressservices.TailnetTarget{
IP: svc.Annotations[AnnotationTailnetTargetIP],
}
}
func egressSvcCfg(externalNameSvc, clusterIPSvc *corev1.Service) egressservices.Config {
tt := tailnetTargetFromSvc(externalNameSvc)
cfg := egressservices.Config{TailnetTarget: tt}
for _, svcPort := range clusterIPSvc.Spec.Ports {
pm := portMap(svcPort)
mak.Set(&cfg.Ports, pm, struct{}{})
}
return cfg
}
func portMap(p corev1.ServicePort) egressservices.PortMap {
// TODO (irbekrm): out of bounds check?
return egressservices.PortMap{Protocol: string(p.Protocol), MatchPort: uint16(p.TargetPort.IntVal), TargetPort: uint16(p.Port)}
}
func isEgressSvcForProxyGroup(obj client.Object) bool {
s, ok := obj.(*corev1.Service)
if !ok {
return false
}
annots := s.ObjectMeta.Annotations
return annots[AnnotationProxyGroup] != "" && (annots[AnnotationTailnetTargetFQDN] != "" || annots[AnnotationTailnetTargetIP] != "")
}
// egressSvcConfig returns a ConfigMap that contains egress services configuration for the provided ProxyGroup as well
// as unmarshalled configuration from the ConfigMap.
func egressSvcsConfigs(ctx context.Context, cl client.Client, proxyGroupName, tsNamespace string) (cm *corev1.ConfigMap, cfgs *egressservices.Configs, err error) {
name := pgEgressCMName(proxyGroupName)
cm = &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Namespace: tsNamespace,
},
}
if err := cl.Get(ctx, client.ObjectKeyFromObject(cm), cm); err != nil {
return nil, nil, fmt.Errorf("error retrieving egress services ConfigMap %s: %v", name, err)
}
cfgs = &egressservices.Configs{}
if len(cm.BinaryData[egressservices.KeyEgressServices]) != 0 {
if err := json.Unmarshal(cm.BinaryData[egressservices.KeyEgressServices], cfgs); err != nil {
return nil, nil, fmt.Errorf("error unmarshaling egress services config %v: %w", cm.BinaryData[egressservices.KeyEgressServices], err)
}
}
return cm, cfgs, nil
}
// egressSvcChildResourceLabels returns labels that should be applied to the
// ClusterIP Service and the EndpointSlice created for the egress service.
// TODO(irbekrm): we currently set a bunch of labels based on Kubernetes
// resource names (ProxyGroup, Service). Maximum allowed label length is 63
// chars whilst the maximum allowed resource name length is 253 chars, so we
// should probably validate and truncate (?) the names is they are too long.
func egressSvcChildResourceLabels(svc *corev1.Service) map[string]string {
return map[string]string{
LabelManaged: "true",
LabelParentType: "svc",
LabelParentName: svc.Name,
LabelParentNamespace: svc.Namespace,
labelProxyGroup: svc.Annotations[AnnotationProxyGroup],
labelSvcType: typeEgress,
}
}
// egressEpsLabels returns labels to be added to an EndpointSlice created for an egress service.
func egressSvcEpsLabels(extNSvc, clusterIPSvc *corev1.Service) map[string]string {
l := egressSvcChildResourceLabels(extNSvc)
// Adding this label is what makes kube proxy set up rules to route traffic sent to the clusterIP Service to the
// endpoints defined on this EndpointSlice.
// https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#ownership
l[discoveryv1.LabelServiceName] = clusterIPSvc.Name
// Kubernetes recommends setting this label.
// https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#management
l[discoveryv1.LabelManagedBy] = "tailscale.com"
return l
}
func svcConfigurationUpToDate(svc *corev1.Service, l *zap.SugaredLogger) bool {
cond := tsoperator.GetServiceCondition(svc, tsapi.EgressSvcConfigured)
if cond == nil {
return false
}
if cond.Status != metav1.ConditionTrue {
return false
}
wantsReadyReason := svcConfiguredReason(svc, true, l)
return strings.EqualFold(wantsReadyReason, cond.Reason)
}
func cfgHash(c cfg, l *zap.SugaredLogger) string {
bs, err := json.Marshal(c)
if err != nil {
// Don't use l.Error as that messes up component logs with, in this case, unnecessary stack trace.
l.Infof("error marhsalling Config: %v", err)
return ""
}
h := sha256.New()
if _, err := h.Write(bs); err != nil {
// Don't use l.Error as that messes up component logs with, in this case, unnecessary stack trace.
l.Infof("error producing Config hash: %v", err)
return ""
}
return fmt.Sprintf("%x", h.Sum(nil))
}
type cfg struct {
Ports []corev1.ServicePort `json:"ports"`
TailnetTarget egressservices.TailnetTarget `json:"tailnetTarget"`
ProxyGroup string `json:"proxyGroup"`
}
func svcConfiguredReason(svc *corev1.Service, configured bool, l *zap.SugaredLogger) string {
var r string
if configured {
r = "ConfiguredFor:"
} else {
r = fmt.Sprintf("ConfigurationFailed:%s", r)
}
r += fmt.Sprintf("ProxyGroup:%s", svc.Annotations[AnnotationProxyGroup])
tt := tailnetTargetFromSvc(svc)
s := cfg{
Ports: svc.Spec.Ports,
TailnetTarget: tt,
ProxyGroup: svc.Annotations[AnnotationProxyGroup],
}
r += fmt.Sprintf(":Config:%s", cfgHash(s, l))
return r
}
// tailnetSvc accepts and ExternalName Service name and returns a name that will be used to distinguish this tailnet
// service from other tailnet services exposed to cluster workloads.
func tailnetSvcName(extNSvc *corev1.Service) string {
return fmt.Sprintf("%s-%s", extNSvc.Namespace, extNSvc.Name)
}
// epsPortsFromSvc takes the ClusterIP Service created for an egress service and
// returns its Port array in a form that can be used for an EndpointSlice.
func epsPortsFromSvc(svc *corev1.Service) (ep []discoveryv1.EndpointPort) {
for _, p := range svc.Spec.Ports {
ep = append(ep, discoveryv1.EndpointPort{
Protocol: &p.Protocol,
Port: &p.TargetPort.IntVal,
Name: &p.Name,
})
}
return ep
}

View File

@@ -0,0 +1,268 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"context"
"encoding/json"
"fmt"
"testing"
"github.com/AlekSi/pointer"
"github.com/google/go-cmp/cmp"
"go.uber.org/zap"
corev1 "k8s.io/api/core/v1"
discoveryv1 "k8s.io/api/discovery/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/egressservices"
"tailscale.com/tstest"
"tailscale.com/tstime"
)
func TestTailscaleEgressServices(t *testing.T) {
pg := &tsapi.ProxyGroup{
TypeMeta: metav1.TypeMeta{Kind: "ProxyGroup", APIVersion: "tailscale.com/v1alpha1"},
ObjectMeta: metav1.ObjectMeta{
Name: "foo",
UID: types.UID("1234-UID"),
},
Spec: tsapi.ProxyGroupSpec{
Replicas: pointer.To[int32](3),
Type: tsapi.ProxyGroupTypeEgress,
},
}
cm := &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: pgEgressCMName("foo"),
Namespace: "operator-ns",
},
}
fc := fake.NewClientBuilder().
WithScheme(tsapi.GlobalScheme).
WithObjects(pg, cm).
WithStatusSubresource(pg).
Build()
zl, err := zap.NewDevelopment()
if err != nil {
t.Fatal(err)
}
clock := tstest.NewClock(tstest.ClockOpts{})
esr := &egressSvcsReconciler{
Client: fc,
logger: zl.Sugar(),
clock: clock,
tsNamespace: "operator-ns",
}
tailnetTargetFQDN := "foo.bar.ts.net."
svc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
Namespace: "default",
UID: types.UID("1234-UID"),
Annotations: map[string]string{
AnnotationTailnetTargetFQDN: tailnetTargetFQDN,
AnnotationProxyGroup: "foo",
},
},
Spec: corev1.ServiceSpec{
ExternalName: "placeholder",
Type: corev1.ServiceTypeExternalName,
Selector: nil,
Ports: []corev1.ServicePort{
{
Name: "http",
Protocol: "TCP",
Port: 80,
},
{
Name: "https",
Protocol: "TCP",
Port: 443,
},
},
},
}
t.Run("proxy_group_not_ready", func(t *testing.T) {
mustCreate(t, fc, svc)
expectReconciled(t, esr, "default", "test")
// Service should have EgressSvcValid condition set to Unknown.
svc.Status.Conditions = []metav1.Condition{condition(tsapi.EgressSvcValid, metav1.ConditionUnknown, reasonProxyGroupNotReady, reasonProxyGroupNotReady, clock)}
expectEqual(t, fc, svc, nil)
})
t.Run("proxy_group_ready", func(t *testing.T) {
mustUpdateStatus(t, fc, "", "foo", func(pg *tsapi.ProxyGroup) {
pg.Status.Conditions = []metav1.Condition{
condition(tsapi.ProxyGroupReady, metav1.ConditionTrue, "", "", clock),
}
})
// Quirks of the fake client.
mustUpdateStatus(t, fc, "default", "test", func(svc *corev1.Service) {
svc.Status.Conditions = []metav1.Condition{}
})
expectReconciled(t, esr, "default", "test")
// Verify that a ClusterIP Service has been created.
name := findGenNameForEgressSvcResources(t, fc, svc)
expectEqual(t, fc, clusterIPSvc(name, svc), removeTargetPortsFromSvc)
clusterSvc := mustGetClusterIPSvc(t, fc, name)
// Verify that an EndpointSlice has been created.
expectEqual(t, fc, endpointSlice(name, svc, clusterSvc), nil)
// Verify that ConfigMap contains configuration for the new egress service.
mustHaveConfigForSvc(t, fc, svc, clusterSvc, cm)
r := svcConfiguredReason(svc, true, zl.Sugar())
// Verify that the user-created ExternalName Service has Configured set to true and ExternalName pointing to the
// CluterIP Service.
svc.Status.Conditions = []metav1.Condition{
condition(tsapi.EgressSvcConfigured, metav1.ConditionTrue, r, r, clock),
}
svc.ObjectMeta.Finalizers = []string{"tailscale.com/finalizer"}
svc.Spec.ExternalName = fmt.Sprintf("%s.operator-ns.svc.cluster.local", name)
expectEqual(t, fc, svc, nil)
})
t.Run("delete_external_name_service", func(t *testing.T) {
name := findGenNameForEgressSvcResources(t, fc, svc)
if err := fc.Delete(context.Background(), svc); err != nil {
t.Fatalf("error deleting ExternalName Service: %v", err)
}
expectReconciled(t, esr, "default", "test")
// Verify that ClusterIP Service and EndpointSlice have been deleted.
expectMissing[corev1.Service](t, fc, "operator-ns", name)
expectMissing[discoveryv1.EndpointSlice](t, fc, "operator-ns", fmt.Sprintf("%s-ipv4", name))
// Verify that service config has been deleted from the ConfigMap.
mustNotHaveConfigForSvc(t, fc, svc, cm)
})
}
func condition(typ tsapi.ConditionType, st metav1.ConditionStatus, r, msg string, clock tstime.Clock) metav1.Condition {
return metav1.Condition{
Type: string(typ),
Status: st,
LastTransitionTime: conditionTime(clock),
Reason: r,
Message: msg,
}
}
func findGenNameForEgressSvcResources(t *testing.T, client client.Client, svc *corev1.Service) string {
t.Helper()
labels := egressSvcChildResourceLabels(svc)
s, err := getSingleObject[corev1.Service](context.Background(), client, "operator-ns", labels)
if err != nil {
t.Fatalf("finding ClusterIP Service for ExternalName Service %s: %v", svc.Name, err)
}
if s == nil {
t.Fatalf("no ClusterIP Service found for ExternalName Service %q", svc.Name)
}
return s.GetName()
}
func clusterIPSvc(name string, extNSvc *corev1.Service) *corev1.Service {
labels := egressSvcChildResourceLabels(extNSvc)
return &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Namespace: "operator-ns",
GenerateName: fmt.Sprintf("ts-%s-", extNSvc.Name),
Labels: labels,
},
Spec: corev1.ServiceSpec{
Type: corev1.ServiceTypeClusterIP,
Ports: extNSvc.Spec.Ports,
},
}
}
func mustGetClusterIPSvc(t *testing.T, cl client.Client, name string) *corev1.Service {
svc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Namespace: "operator-ns",
},
}
if err := cl.Get(context.Background(), client.ObjectKeyFromObject(svc), svc); err != nil {
t.Fatalf("error retrieving Service")
}
return svc
}
func endpointSlice(name string, extNSvc, clusterIPSvc *corev1.Service) *discoveryv1.EndpointSlice {
labels := egressSvcChildResourceLabels(extNSvc)
labels[discoveryv1.LabelManagedBy] = "tailscale.com"
labels[discoveryv1.LabelServiceName] = name
return &discoveryv1.EndpointSlice{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-ipv4", name),
Namespace: "operator-ns",
Labels: labels,
},
Ports: portsForEndpointSlice(clusterIPSvc),
AddressType: discoveryv1.AddressTypeIPv4,
}
}
func portsForEndpointSlice(svc *corev1.Service) []discoveryv1.EndpointPort {
ports := make([]discoveryv1.EndpointPort, 0)
for _, p := range svc.Spec.Ports {
ports = append(ports, discoveryv1.EndpointPort{
Name: &p.Name,
Protocol: &p.Protocol,
Port: pointer.ToInt32(p.TargetPort.IntVal),
})
}
return ports
}
func mustHaveConfigForSvc(t *testing.T, cl client.Client, extNSvc, clusterIPSvc *corev1.Service, cm *corev1.ConfigMap) {
t.Helper()
wantsCfg := egressSvcCfg(extNSvc, clusterIPSvc)
if err := cl.Get(context.Background(), client.ObjectKeyFromObject(cm), cm); err != nil {
t.Fatalf("Error retrieving ConfigMap: %v", err)
}
name := tailnetSvcName(extNSvc)
gotCfg := configFromCM(t, cm, name)
if gotCfg == nil {
t.Fatalf("No config found for service %q", name)
}
if diff := cmp.Diff(*gotCfg, wantsCfg); diff != "" {
t.Fatalf("unexpected config for service %q (-got +want):\n%s", name, diff)
}
}
func mustNotHaveConfigForSvc(t *testing.T, cl client.Client, extNSvc *corev1.Service, cm *corev1.ConfigMap) {
t.Helper()
if err := cl.Get(context.Background(), client.ObjectKeyFromObject(cm), cm); err != nil {
t.Fatalf("Error retrieving ConfigMap: %v", err)
}
name := tailnetSvcName(extNSvc)
gotCfg := configFromCM(t, cm, name)
if gotCfg != nil {
t.Fatalf("Config %#+v for service %q found when it should not be present", gotCfg, name)
}
}
func configFromCM(t *testing.T, cm *corev1.ConfigMap, svcName string) *egressservices.Config {
t.Helper()
cfgBs, ok := cm.BinaryData[egressservices.KeyEgressServices]
if !ok {
return nil
}
cfgs := &egressservices.Configs{}
if err := json.Unmarshal(cfgBs, cfgs); err != nil {
t.Fatalf("error unmarshalling config: %v", err)
}
cfg, ok := (*cfgs)[svcName]
if ok {
return &cfg
}
return nil
}

View File

@@ -25,11 +25,13 @@ const (
proxyClassCRDPath = operatorDeploymentFilesPath + "/crds/tailscale.com_proxyclasses.yaml"
dnsConfigCRDPath = operatorDeploymentFilesPath + "/crds/tailscale.com_dnsconfigs.yaml"
recorderCRDPath = operatorDeploymentFilesPath + "/crds/tailscale.com_recorders.yaml"
proxyGroupCRDPath = operatorDeploymentFilesPath + "/crds/tailscale.com_proxygroups.yaml"
helmTemplatesPath = operatorDeploymentFilesPath + "/chart/templates"
connectorCRDHelmTemplatePath = helmTemplatesPath + "/connector.yaml"
proxyClassCRDHelmTemplatePath = helmTemplatesPath + "/proxyclass.yaml"
dnsConfigCRDHelmTemplatePath = helmTemplatesPath + "/dnsconfig.yaml"
recorderCRDHelmTemplatePath = helmTemplatesPath + "/recorder.yaml"
proxyGroupCRDHelmTemplatePath = helmTemplatesPath + "/proxygroup.yaml"
helmConditionalStart = "{{ if .Values.installCRDs -}}\n"
helmConditionalEnd = "{{- end -}}"
@@ -146,6 +148,7 @@ func generate(baseDir string) error {
{proxyClassCRDPath, proxyClassCRDHelmTemplatePath},
{dnsConfigCRDPath, dnsConfigCRDHelmTemplatePath},
{recorderCRDPath, recorderCRDHelmTemplatePath},
{proxyGroupCRDPath, proxyGroupCRDHelmTemplatePath},
} {
if err := addCRDToHelm(crd.crdPath, crd.templatePath); err != nil {
return fmt.Errorf("error adding %s CRD to Helm templates: %w", crd.crdPath, err)
@@ -161,6 +164,7 @@ func cleanup(baseDir string) error {
proxyClassCRDHelmTemplatePath,
dnsConfigCRDHelmTemplatePath,
recorderCRDHelmTemplatePath,
proxyGroupCRDHelmTemplatePath,
} {
if err := os.Remove(filepath.Join(baseDir, path)); err != nil && !os.IsNotExist(err) {
return fmt.Errorf("error cleaning up %s: %w", path, err)

View File

@@ -62,6 +62,9 @@ func Test_generate(t *testing.T) {
if !strings.Contains(installContentsWithCRD.String(), "name: recorders.tailscale.com") {
t.Errorf("Recorder CRD not found in default chart install")
}
if !strings.Contains(installContentsWithCRD.String(), "name: proxygroups.tailscale.com") {
t.Errorf("ProxyGroup CRD not found in default chart install")
}
// Test that CRDs can be excluded from Helm chart install
installContentsWithoutCRD := bytes.NewBuffer([]byte{})
@@ -83,4 +86,7 @@ func Test_generate(t *testing.T) {
if strings.Contains(installContentsWithoutCRD.String(), "name: recorders.tailscale.com") {
t.Errorf("Recorder CRD found in chart install that should not contain a CRD")
}
if strings.Contains(installContentsWithoutCRD.String(), "name: proxygroups.tailscale.com") {
t.Errorf("ProxyGroup CRD found in chart install that should not contain a CRD")
}
}

View File

@@ -48,7 +48,7 @@ type IngressReconciler struct {
// managing. This is only used for metrics.
managedIngresses set.Slice[types.UID]
proxyDefaultClass string
defaultProxyClass string
}
var (
@@ -136,7 +136,7 @@ func (a *IngressReconciler) maybeProvision(ctx context.Context, logger *zap.Suga
}
}
proxyClass := proxyClassForObject(ing, a.proxyDefaultClass)
proxyClass := proxyClassForObject(ing, a.defaultProxyClass)
if proxyClass != "" {
if ready, err := proxyClassIsReady(ctx, proxyClass, a.Client); err != nil {
return fmt.Errorf("error verifying ProxyClass for Ingress: %w", err)

View File

@@ -253,7 +253,7 @@ func TestTailscaleIngressWithProxyClass(t *testing.T) {
pc.Status = tsapi.ProxyClassStatus{
Conditions: []metav1.Condition{{
Status: metav1.ConditionTrue,
Type: string(tsapi.ProxyClassready),
Type: string(tsapi.ProxyClassReady),
ObservedGeneration: pc.Generation,
}}}
})

View File

@@ -109,7 +109,7 @@ func main() {
proxyActAsDefaultLoadBalancer: isDefaultLoadBalancer,
proxyTags: tags,
proxyFirewallMode: tsFirewallMode,
proxyDefaultClass: defaultProxyClass,
defaultProxyClass: defaultProxyClass,
}
runReconcilers(rOpts)
}
@@ -143,6 +143,7 @@ func initTSNet(zlog *zap.SugaredLogger) (*tsnet.Server, *tailscale.Client) {
TokenURL: "https://login.tailscale.com/api/v2/oauth/token",
}
tsClient := tailscale.NewClient("-", nil)
tsClient.UserAgent = "tailscale-k8s-operator"
tsClient.HTTPClient = credentials.Client(context.Background())
s := &tsnet.Server{
@@ -238,6 +239,7 @@ func runReconcilers(opts reconcilerOpts) {
ByObject: map[client.Object]cache.ByObject{
&corev1.Secret{}: nsFilter,
&corev1.ServiceAccount{}: nsFilter,
&corev1.Pod{}: nsFilter,
&corev1.ConfigMap{}: nsFilter,
&appsv1.StatefulSet{}: nsFilter,
&appsv1.Deployment{}: nsFilter,
@@ -285,7 +287,7 @@ func runReconcilers(opts reconcilerOpts) {
recorder: eventRecorder,
tsNamespace: opts.tailscaleNamespace,
clock: tstime.DefaultClock{},
proxyDefaultClass: opts.proxyDefaultClass,
defaultProxyClass: opts.defaultProxyClass,
})
if err != nil {
startlog.Fatalf("could not create service reconciler: %v", err)
@@ -308,7 +310,7 @@ func runReconcilers(opts reconcilerOpts) {
recorder: eventRecorder,
Client: mgr.GetClient(),
logger: opts.log.Named("ingress-reconciler"),
proxyDefaultClass: opts.proxyDefaultClass,
defaultProxyClass: opts.defaultProxyClass,
})
if err != nil {
startlog.Fatalf("could not create ingress reconciler: %v", err)
@@ -353,6 +355,65 @@ func runReconcilers(opts reconcilerOpts) {
if err != nil {
startlog.Fatalf("could not create nameserver reconciler: %v", err)
}
egressSvcFilter := handler.EnqueueRequestsFromMapFunc(egressSvcsHandler)
egressProxyGroupFilter := handler.EnqueueRequestsFromMapFunc(egressSvcsFromEgressProxyGroup(mgr.GetClient(), opts.log))
err = builder.
ControllerManagedBy(mgr).
Named("egress-svcs-reconciler").
Watches(&corev1.Service{}, egressSvcFilter).
Watches(&tsapi.ProxyGroup{}, egressProxyGroupFilter).
Complete(&egressSvcsReconciler{
Client: mgr.GetClient(),
tsNamespace: opts.tailscaleNamespace,
recorder: eventRecorder,
clock: tstime.DefaultClock{},
logger: opts.log.Named("egress-svcs-reconciler"),
})
if err != nil {
startlog.Fatalf("could not create egress Services reconciler: %v", err)
}
if err := mgr.GetFieldIndexer().IndexField(context.Background(), new(corev1.Service), indexEgressProxyGroup, indexEgressServices); err != nil {
startlog.Fatalf("failed setting up indexer for egress Services: %v", err)
}
egressSvcFromEpsFilter := handler.EnqueueRequestsFromMapFunc(egressSvcFromEps)
err = builder.
ControllerManagedBy(mgr).
Named("egress-svcs-readiness-reconciler").
Watches(&corev1.Service{}, egressSvcFilter).
Watches(&discoveryv1.EndpointSlice{}, egressSvcFromEpsFilter).
Complete(&egressSvcsReadinessReconciler{
Client: mgr.GetClient(),
tsNamespace: opts.tailscaleNamespace,
clock: tstime.DefaultClock{},
logger: opts.log.Named("egress-svcs-readiness-reconciler"),
})
if err != nil {
startlog.Fatalf("could not create egress Services readiness reconciler: %v", err)
}
epsFilter := handler.EnqueueRequestsFromMapFunc(egressEpsHandler)
podsFilter := handler.EnqueueRequestsFromMapFunc(egressEpsFromPGPods(mgr.GetClient(), opts.tailscaleNamespace))
secretsFilter := handler.EnqueueRequestsFromMapFunc(egressEpsFromPGStateSecrets(mgr.GetClient(), opts.tailscaleNamespace))
epsFromExtNSvcFilter := handler.EnqueueRequestsFromMapFunc(epsFromExternalNameService(mgr.GetClient(), opts.log, opts.tailscaleNamespace))
err = builder.
ControllerManagedBy(mgr).
Named("egress-eps-reconciler").
Watches(&discoveryv1.EndpointSlice{}, epsFilter).
Watches(&corev1.Pod{}, podsFilter).
Watches(&corev1.Secret{}, secretsFilter).
Watches(&corev1.Service{}, epsFromExtNSvcFilter).
Complete(&egressEpsReconciler{
Client: mgr.GetClient(),
tsNamespace: opts.tailscaleNamespace,
logger: opts.log.Named("egress-eps-reconciler"),
})
if err != nil {
startlog.Fatalf("could not create egress EndpointSlices reconciler: %v", err)
}
err = builder.ControllerManagedBy(mgr).
For(&tsapi.ProxyClass{}).
Complete(&ProxyClassReconciler{
@@ -414,6 +475,34 @@ func runReconcilers(opts reconcilerOpts) {
startlog.Fatalf("could not create Recorder reconciler: %v", err)
}
// Recorder reconciler.
ownedByProxyGroupFilter := handler.EnqueueRequestForOwner(mgr.GetScheme(), mgr.GetRESTMapper(), &tsapi.ProxyGroup{})
proxyClassFilterForProxyGroup := handler.EnqueueRequestsFromMapFunc(proxyClassHandlerForProxyGroup(mgr.GetClient(), startlog))
err = builder.ControllerManagedBy(mgr).
For(&tsapi.ProxyGroup{}).
Watches(&appsv1.StatefulSet{}, ownedByProxyGroupFilter).
Watches(&corev1.ServiceAccount{}, ownedByProxyGroupFilter).
Watches(&corev1.Secret{}, ownedByProxyGroupFilter).
Watches(&rbacv1.Role{}, ownedByProxyGroupFilter).
Watches(&rbacv1.RoleBinding{}, ownedByProxyGroupFilter).
Watches(&tsapi.ProxyClass{}, proxyClassFilterForProxyGroup).
Complete(&ProxyGroupReconciler{
recorder: eventRecorder,
Client: mgr.GetClient(),
l: opts.log.Named("proxygroup-reconciler"),
clock: tstime.DefaultClock{},
tsClient: opts.tsClient,
tsNamespace: opts.tailscaleNamespace,
proxyImage: opts.proxyImage,
defaultTags: strings.Split(opts.proxyTags, ","),
tsFirewallMode: opts.proxyFirewallMode,
defaultProxyClass: opts.defaultProxyClass,
})
if err != nil {
startlog.Fatalf("could not create ProxyGroup reconciler: %v", err)
}
startlog.Infof("Startup complete, operator running, version: %s", version.Long())
if err := mgr.Start(signals.SetupSignalHandler()); err != nil {
startlog.Fatalf("could not start manager: %v", err)
@@ -454,10 +543,10 @@ type reconcilerOpts struct {
// Auto is usually the best choice, unless you want to explicitly set
// specific mode for debugging purposes.
proxyFirewallMode string
// proxyDefaultClass is the name of the ProxyClass to use as the default
// defaultProxyClass is the name of the ProxyClass to use as the default
// class for proxies that do not have a ProxyClass set.
// this is defined by an operator env variable.
proxyDefaultClass string
defaultProxyClass string
}
// enqueueAllIngressEgressProxySvcsinNS returns a reconcile request for each
@@ -646,6 +735,27 @@ func proxyClassHandlerForConnector(cl client.Client, logger *zap.SugaredLogger)
}
}
// proxyClassHandlerForConnector returns a handler that, for a given ProxyClass,
// returns a list of reconcile requests for all Connectors that have
// .spec.proxyClass set.
func proxyClassHandlerForProxyGroup(cl client.Client, logger *zap.SugaredLogger) handler.MapFunc {
return func(ctx context.Context, o client.Object) []reconcile.Request {
pgList := new(tsapi.ProxyGroupList)
if err := cl.List(ctx, pgList); err != nil {
logger.Debugf("error listing ProxyGroups for ProxyClass: %v", err)
return nil
}
reqs := make([]reconcile.Request, 0)
proxyClassName := o.GetName()
for _, pg := range pgList.Items {
if pg.Spec.ProxyClass == proxyClassName {
reqs = append(reqs, reconcile.Request{NamespacedName: client.ObjectKeyFromObject(&pg)})
}
}
return reqs
}
}
// serviceHandlerForIngress returns a handler for Service events for ingress
// reconciler that ensures that if the Service associated with an event is of
// interest to the reconciler, the associated Ingress(es) gets be reconciled.
@@ -687,6 +797,10 @@ func serviceHandlerForIngress(cl client.Client, logger *zap.SugaredLogger) handl
}
func serviceHandler(_ context.Context, o client.Object) []reconcile.Request {
if _, ok := o.GetAnnotations()[AnnotationProxyGroup]; ok {
// Do not reconcile Services for ProxyGroup.
return nil
}
if isManagedByType(o, "svc") {
// If this is a Service managed by a Service we want to enqueue its parent
return []reconcile.Request{{NamespacedName: parentFromObjectLabels(o)}}
@@ -712,3 +826,195 @@ func isMagicDNSName(name string) bool {
validMagicDNSName := regexp.MustCompile(`^[a-zA-Z0-9-]+\.[a-zA-Z0-9-]+\.ts\.net\.?$`)
return validMagicDNSName.MatchString(name)
}
// egressSvcsHandler returns accepts a Kubernetes object and returns a reconcile
// request for it , if the object is a Tailscale egress Service meant to be
// exposed on a ProxyGroup.
func egressSvcsHandler(_ context.Context, o client.Object) []reconcile.Request {
if !isEgressSvcForProxyGroup(o) {
return nil
}
return []reconcile.Request{
{
NamespacedName: types.NamespacedName{
Namespace: o.GetNamespace(),
Name: o.GetName(),
},
},
}
}
// egressEpsHandler returns accepts an EndpointSlice and, if the EndpointSlice
// is for an egress service, returns a reconcile request for it.
func egressEpsHandler(_ context.Context, o client.Object) []reconcile.Request {
if typ := o.GetLabels()[labelSvcType]; typ != typeEgress {
return nil
}
return []reconcile.Request{
{
NamespacedName: types.NamespacedName{
Namespace: o.GetNamespace(),
Name: o.GetName(),
},
},
}
}
// egressEpsFromEgressPods returns a Pod event handler that checks if Pod is a replica for a ProxyGroup and if it is,
// returns reconciler requests for all egress EndpointSlices for that ProxyGroup.
func egressEpsFromPGPods(cl client.Client, ns string) handler.MapFunc {
return func(_ context.Context, o client.Object) []reconcile.Request {
if v, ok := o.GetLabels()[LabelManaged]; !ok || v != "true" {
return nil
}
// TODO(irbekrm): for now this is good enough as all ProxyGroups are egress. Add a type check once we
// have ingress ProxyGroups.
if typ := o.GetLabels()[LabelParentType]; typ != "proxygroup" {
return nil
}
pg, ok := o.GetLabels()[LabelParentName]
if !ok {
return nil
}
return reconcileRequestsForPG(pg, cl, ns)
}
}
// egressEpsFromPGStateSecrets returns a Secret event handler that checks if Secret is a state Secret for a ProxyGroup and if it is,
// returns reconciler requests for all egress EndpointSlices for that ProxyGroup.
func egressEpsFromPGStateSecrets(cl client.Client, ns string) handler.MapFunc {
return func(_ context.Context, o client.Object) []reconcile.Request {
if v, ok := o.GetLabels()[LabelManaged]; !ok || v != "true" {
return nil
}
// TODO(irbekrm): for now this is good enough as all ProxyGroups are egress. Add a type check once we
// have ingress ProxyGroups.
if parentType := o.GetLabels()[LabelParentType]; parentType != "proxygroup" {
return nil
}
if secretType := o.GetLabels()[labelSecretType]; secretType != "state" {
return nil
}
pg, ok := o.GetLabels()[LabelParentName]
if !ok {
return nil
}
return reconcileRequestsForPG(pg, cl, ns)
}
}
// egressSvcFromEps is an event handler for EndpointSlices. If an EndpointSlice is for an egress ExternalName Service
// meant to be exposed on a ProxyGroup, returns a reconcile request for the Service.
func egressSvcFromEps(_ context.Context, o client.Object) []reconcile.Request {
if typ := o.GetLabels()[labelSvcType]; typ != typeEgress {
return nil
}
if v, ok := o.GetLabels()[LabelManaged]; !ok || v != "true" {
return nil
}
svcName, ok := o.GetLabels()[LabelParentName]
if !ok {
return nil
}
svcNs, ok := o.GetLabels()[LabelParentNamespace]
if !ok {
return nil
}
return []reconcile.Request{
{
NamespacedName: types.NamespacedName{
Namespace: svcNs,
Name: svcName,
},
},
}
}
func reconcileRequestsForPG(pg string, cl client.Client, ns string) []reconcile.Request {
epsList := discoveryv1.EndpointSliceList{}
if err := cl.List(context.Background(), &epsList,
client.InNamespace(ns),
client.MatchingLabels(map[string]string{labelProxyGroup: pg})); err != nil {
return nil
}
reqs := make([]reconcile.Request, 0)
for _, ep := range epsList.Items {
reqs = append(reqs, reconcile.Request{
NamespacedName: types.NamespacedName{
Namespace: ep.Namespace,
Name: ep.Name,
},
})
}
return reqs
}
// egressSvcsFromEgressProxyGroup is an event handler for egress ProxyGroups. It returns reconcile requests for all
// user-created ExternalName Services that should be exposed on this ProxyGroup.
func egressSvcsFromEgressProxyGroup(cl client.Client, logger *zap.SugaredLogger) handler.MapFunc {
return func(_ context.Context, o client.Object) []reconcile.Request {
pg, ok := o.(*tsapi.ProxyGroup)
if !ok {
logger.Infof("[unexpected] ProxyGroup handler triggered for an object that is not a ProxyGroup")
return nil
}
if pg.Spec.Type != tsapi.ProxyGroupTypeEgress {
return nil
}
svcList := &corev1.ServiceList{}
if err := cl.List(context.Background(), svcList, client.MatchingFields{indexEgressProxyGroup: pg.Name}); err != nil {
logger.Infof("error listing Services: %v, skipping a reconcile for event on ProxyGroup %s", err, pg.Name)
return nil
}
reqs := make([]reconcile.Request, 0)
for _, svc := range svcList.Items {
reqs = append(reqs, reconcile.Request{
NamespacedName: types.NamespacedName{
Namespace: svc.Namespace,
Name: svc.Name,
},
})
}
return reqs
}
}
// epsFromExternalNameService is an event handler for ExternalName Services that define a Tailscale egress service that
// should be exposed on a ProxyGroup. It returns reconcile requests for EndpointSlices created for this Service.
func epsFromExternalNameService(cl client.Client, logger *zap.SugaredLogger, ns string) handler.MapFunc {
return func(_ context.Context, o client.Object) []reconcile.Request {
svc, ok := o.(*corev1.Service)
if !ok {
logger.Infof("[unexpected] Service handler triggered for an object that is not a Service")
return nil
}
if !isEgressSvcForProxyGroup(svc) {
return nil
}
epsList := &discoveryv1.EndpointSliceList{}
if err := cl.List(context.Background(), epsList, client.InNamespace(ns),
client.MatchingLabels(egressSvcChildResourceLabels(svc))); err != nil {
logger.Infof("error listing EndpointSlices: %v, skipping a reconcile for event on Service %s", err, svc.Name)
return nil
}
reqs := make([]reconcile.Request, 0)
for _, eps := range epsList.Items {
reqs = append(reqs, reconcile.Request{
NamespacedName: types.NamespacedName{
Namespace: eps.Namespace,
Name: eps.Name,
},
})
}
return reqs
}
}
// indexEgressServices adds a local index to a cached Tailscale egress Services meant to be exposed on a ProxyGroup. The
// index is used a list filter.
func indexEgressServices(o client.Object) []string {
if !isEgressSvcForProxyGroup(o) {
return nil
}
return []string{o.GetAnnotations()[AnnotationProxyGroup]}
}

View File

@@ -1064,7 +1064,7 @@ func TestProxyClassForService(t *testing.T) {
pc.Status = tsapi.ProxyClassStatus{
Conditions: []metav1.Condition{{
Status: metav1.ConditionTrue,
Type: string(tsapi.ProxyClassready),
Type: string(tsapi.ProxyClassReady),
ObservedGeneration: pc.Generation,
}}}
})
@@ -1487,6 +1487,72 @@ func Test_clusterDomainFromResolverConf(t *testing.T) {
})
}
}
func Test_authKeyRemoval(t *testing.T) {
fc := fake.NewFakeClient()
ft := &fakeTSClient{}
zl, err := zap.NewDevelopment()
if err != nil {
t.Fatal(err)
}
// 1. A new Service that should be exposed via Tailscale gets created, a Secret with a config that contains auth
// key is generated.
clock := tstest.NewClock(tstest.ClockOpts{})
sr := &ServiceReconciler{
Client: fc,
ssr: &tailscaleSTSReconciler{
Client: fc,
tsClient: ft,
defaultTags: []string{"tag:k8s"},
operatorNamespace: "operator-ns",
proxyImage: "tailscale/tailscale",
},
logger: zl.Sugar(),
clock: clock,
}
mustCreate(t, fc, &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
Namespace: "default",
UID: types.UID("1234-UID"),
},
Spec: corev1.ServiceSpec{
ClusterIP: "10.20.30.40",
Type: corev1.ServiceTypeLoadBalancer,
LoadBalancerClass: ptr.To("tailscale"),
},
})
expectReconciled(t, sr, "default", "test")
fullName, shortName := findGenName(t, fc, "default", "test", "svc")
opts := configOpts{
stsName: shortName,
secretName: fullName,
namespace: "default",
parentType: "svc",
hostname: "default-test",
clusterTargetIP: "10.20.30.40",
app: kubetypes.AppIngressProxy,
}
expectEqual(t, fc, expectedSecret(t, fc, opts), nil)
expectEqual(t, fc, expectedHeadlessService(shortName, "svc"), nil)
expectEqual(t, fc, expectedSTS(t, fc, opts), removeHashAnnotation)
// 2. Apply update to the Secret that imitates the proxy setting device_id.
s := expectedSecret(t, fc, opts)
mustUpdate(t, fc, s.Namespace, s.Name, func(s *corev1.Secret) {
mak.Set(&s.Data, "device_id", []byte("dkkdi4CNTRL"))
})
// 3. Config should no longer contain auth key
expectReconciled(t, sr, "default", "test")
opts.shouldRemoveAuthKey = true
opts.secretExtraData = map[string][]byte{"device_id": []byte("dkkdi4CNTRL")}
expectEqual(t, fc, expectedSecret(t, fc, opts), nil)
}
func Test_externalNameService(t *testing.T) {
fc := fake.NewFakeClient()

View File

@@ -98,9 +98,9 @@ func (pcr *ProxyClassReconciler) Reconcile(ctx context.Context, req reconcile.Re
if errs := pcr.validate(pc); errs != nil {
msg := fmt.Sprintf(messageProxyClassInvalid, errs.ToAggregate().Error())
pcr.recorder.Event(pc, corev1.EventTypeWarning, reasonProxyClassInvalid, msg)
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassready, metav1.ConditionFalse, reasonProxyClassInvalid, msg, pc.Generation, pcr.clock, logger)
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassReady, metav1.ConditionFalse, reasonProxyClassInvalid, msg, pc.Generation, pcr.clock, logger)
} else {
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassready, metav1.ConditionTrue, reasonProxyClassValid, reasonProxyClassValid, pc.Generation, pcr.clock, logger)
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassReady, metav1.ConditionTrue, reasonProxyClassValid, reasonProxyClassValid, pc.Generation, pcr.clock, logger)
}
if !apiequality.Semantic.DeepEqual(oldPCStatus, pc.Status) {
if err := pcr.Client.Status().Update(ctx, pc); err != nil {

View File

@@ -69,7 +69,7 @@ func TestProxyClass(t *testing.T) {
// 1. A valid ProxyClass resource gets its status updated to Ready.
expectReconciled(t, pcr, "", "test")
pc.Status.Conditions = append(pc.Status.Conditions, metav1.Condition{
Type: string(tsapi.ProxyClassready),
Type: string(tsapi.ProxyClassReady),
Status: metav1.ConditionTrue,
Reason: reasonProxyClassValid,
Message: reasonProxyClassValid,
@@ -85,7 +85,7 @@ func TestProxyClass(t *testing.T) {
})
expectReconciled(t, pcr, "", "test")
msg := `ProxyClass is not valid: .spec.statefulSet.labels: Invalid value: "?!someVal": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')`
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassready, metav1.ConditionFalse, reasonProxyClassInvalid, msg, 0, cl, zl.Sugar())
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassReady, metav1.ConditionFalse, reasonProxyClassInvalid, msg, 0, cl, zl.Sugar())
expectEqual(t, fc, pc, nil)
expectedEvent := "Warning ProxyClassInvalid ProxyClass is not valid: .spec.statefulSet.labels: Invalid value: \"?!someVal\": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')"
expectEvents(t, fr, []string{expectedEvent})
@@ -99,7 +99,7 @@ func TestProxyClass(t *testing.T) {
})
expectReconciled(t, pcr, "", "test")
msg = `ProxyClass is not valid: spec.statefulSet.pod.tailscaleContainer.image: Invalid value: "FOO bar": invalid reference format: repository name (library/FOO bar) must be lowercase`
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassready, metav1.ConditionFalse, reasonProxyClassInvalid, msg, 0, cl, zl.Sugar())
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassReady, metav1.ConditionFalse, reasonProxyClassInvalid, msg, 0, cl, zl.Sugar())
expectEqual(t, fc, pc, nil)
expectedEvent = `Warning ProxyClassInvalid ProxyClass is not valid: spec.statefulSet.pod.tailscaleContainer.image: Invalid value: "FOO bar": invalid reference format: repository name (library/FOO bar) must be lowercase`
expectEvents(t, fr, []string{expectedEvent})
@@ -118,7 +118,7 @@ func TestProxyClass(t *testing.T) {
})
expectReconciled(t, pcr, "", "test")
msg = `ProxyClass is not valid: spec.statefulSet.pod.tailscaleInitContainer.image: Invalid value: "FOO bar": invalid reference format: repository name (library/FOO bar) must be lowercase`
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassready, metav1.ConditionFalse, reasonProxyClassInvalid, msg, 0, cl, zl.Sugar())
tsoperator.SetProxyClassCondition(pc, tsapi.ProxyClassReady, metav1.ConditionFalse, reasonProxyClassInvalid, msg, 0, cl, zl.Sugar())
expectEqual(t, fc, pc, nil)
expectedEvent = `Warning ProxyClassInvalid ProxyClass is not valid: spec.statefulSet.pod.tailscaleInitContainer.image: Invalid value: "FOO bar": invalid reference format: repository name (library/FOO bar) must be lowercase`
expectEvents(t, fr, []string{expectedEvent})

View File

@@ -0,0 +1,533 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"context"
"crypto/sha256"
"encoding/json"
"fmt"
"net/http"
"slices"
"sync"
"github.com/pkg/errors"
"go.uber.org/zap"
xslices "golang.org/x/exp/slices"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
rbacv1 "k8s.io/api/rbac/v1"
apiequality "k8s.io/apimachinery/pkg/api/equality"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"k8s.io/client-go/tools/record"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
"tailscale.com/client/tailscale"
"tailscale.com/ipn"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/kubetypes"
"tailscale.com/tailcfg"
"tailscale.com/tstime"
"tailscale.com/types/ptr"
"tailscale.com/util/clientmetric"
"tailscale.com/util/mak"
"tailscale.com/util/set"
)
const (
reasonProxyGroupCreationFailed = "ProxyGroupCreationFailed"
reasonProxyGroupReady = "ProxyGroupReady"
reasonProxyGroupCreating = "ProxyGroupCreating"
reasonProxyGroupInvalid = "ProxyGroupInvalid"
)
var gaugeProxyGroupResources = clientmetric.NewGauge(kubetypes.MetricProxyGroupCount)
// ProxyGroupReconciler ensures cluster resources for a ProxyGroup definition.
type ProxyGroupReconciler struct {
client.Client
l *zap.SugaredLogger
recorder record.EventRecorder
clock tstime.Clock
tsClient tsClient
// User-specified defaults from the helm installation.
tsNamespace string
proxyImage string
defaultTags []string
tsFirewallMode string
defaultProxyClass string
mu sync.Mutex // protects following
proxyGroups set.Slice[types.UID] // for proxygroups gauge
}
func (r *ProxyGroupReconciler) logger(name string) *zap.SugaredLogger {
return r.l.With("ProxyGroup", name)
}
func (r *ProxyGroupReconciler) Reconcile(ctx context.Context, req reconcile.Request) (_ reconcile.Result, err error) {
logger := r.logger(req.Name)
logger.Debugf("starting reconcile")
defer logger.Debugf("reconcile finished")
pg := new(tsapi.ProxyGroup)
err = r.Get(ctx, req.NamespacedName, pg)
if apierrors.IsNotFound(err) {
logger.Debugf("ProxyGroup not found, assuming it was deleted")
return reconcile.Result{}, nil
} else if err != nil {
return reconcile.Result{}, fmt.Errorf("failed to get tailscale.com ProxyGroup: %w", err)
}
if markedForDeletion(pg) {
logger.Debugf("ProxyGroup is being deleted, cleaning up resources")
ix := xslices.Index(pg.Finalizers, FinalizerName)
if ix < 0 {
logger.Debugf("no finalizer, nothing to do")
return reconcile.Result{}, nil
}
if done, err := r.maybeCleanup(ctx, pg); err != nil {
return reconcile.Result{}, err
} else if !done {
logger.Debugf("ProxyGroup resource cleanup not yet finished, will retry...")
return reconcile.Result{RequeueAfter: shortRequeue}, nil
}
pg.Finalizers = slices.Delete(pg.Finalizers, ix, ix+1)
if err := r.Update(ctx, pg); err != nil {
return reconcile.Result{}, err
}
return reconcile.Result{}, nil
}
oldPGStatus := pg.Status.DeepCopy()
setStatusReady := func(pg *tsapi.ProxyGroup, status metav1.ConditionStatus, reason, message string) (reconcile.Result, error) {
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, status, reason, message, pg.Generation, r.clock, logger)
if !apiequality.Semantic.DeepEqual(oldPGStatus, pg.Status) {
// An error encountered here should get returned by the Reconcile function.
if updateErr := r.Client.Status().Update(ctx, pg); updateErr != nil {
err = errors.Wrap(err, updateErr.Error())
}
}
return reconcile.Result{}, err
}
if !slices.Contains(pg.Finalizers, FinalizerName) {
// This log line is printed exactly once during initial provisioning,
// because once the finalizer is in place this block gets skipped. So,
// this is a nice place to log that the high level, multi-reconcile
// operation is underway.
logger.Infof("ensuring ProxyGroup is set up")
pg.Finalizers = append(pg.Finalizers, FinalizerName)
if err = r.Update(ctx, pg); err != nil {
err = fmt.Errorf("error adding finalizer: %w", err)
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreationFailed, reasonProxyGroupCreationFailed)
}
}
if err = r.validate(pg); err != nil {
message := fmt.Sprintf("ProxyGroup is invalid: %s", err)
r.recorder.Eventf(pg, corev1.EventTypeWarning, reasonProxyGroupInvalid, message)
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupInvalid, message)
}
proxyClassName := r.defaultProxyClass
if pg.Spec.ProxyClass != "" {
proxyClassName = pg.Spec.ProxyClass
}
var proxyClass *tsapi.ProxyClass
if proxyClassName != "" {
proxyClass = new(tsapi.ProxyClass)
err := r.Get(ctx, types.NamespacedName{Name: proxyClassName}, proxyClass)
if apierrors.IsNotFound(err) {
err = nil
message := fmt.Sprintf("the ProxyGroup's ProxyClass %s does not (yet) exist", proxyClassName)
logger.Info(message)
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreating, message)
}
if err != nil {
err = fmt.Errorf("error getting ProxyGroup's ProxyClass %s: %s", proxyClassName, err)
r.recorder.Eventf(pg, corev1.EventTypeWarning, reasonProxyGroupCreationFailed, err.Error())
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreationFailed, err.Error())
}
if !tsoperator.ProxyClassIsReady(proxyClass) {
message := fmt.Sprintf("the ProxyGroup's ProxyClass %s is not yet in a ready state, waiting...", proxyClassName)
logger.Info(message)
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreating, message)
}
}
if err = r.maybeProvision(ctx, pg, proxyClass); err != nil {
err = fmt.Errorf("error provisioning ProxyGroup resources: %w", err)
r.recorder.Eventf(pg, corev1.EventTypeWarning, reasonProxyGroupCreationFailed, err.Error())
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreationFailed, err.Error())
}
desiredReplicas := int(pgReplicas(pg))
if len(pg.Status.Devices) < desiredReplicas {
message := fmt.Sprintf("%d/%d ProxyGroup pods running", len(pg.Status.Devices), desiredReplicas)
logger.Debug(message)
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreating, message)
}
if len(pg.Status.Devices) > desiredReplicas {
message := fmt.Sprintf("waiting for %d ProxyGroup pods to shut down", len(pg.Status.Devices)-desiredReplicas)
logger.Debug(message)
return setStatusReady(pg, metav1.ConditionFalse, reasonProxyGroupCreating, message)
}
logger.Info("ProxyGroup resources synced")
return setStatusReady(pg, metav1.ConditionTrue, reasonProxyGroupReady, reasonProxyGroupReady)
}
func (r *ProxyGroupReconciler) maybeProvision(ctx context.Context, pg *tsapi.ProxyGroup, proxyClass *tsapi.ProxyClass) error {
logger := r.logger(pg.Name)
r.mu.Lock()
r.proxyGroups.Add(pg.UID)
gaugeProxyGroupResources.Set(int64(r.proxyGroups.Len()))
r.mu.Unlock()
cfgHash, err := r.ensureConfigSecretsCreated(ctx, pg, proxyClass)
if err != nil {
return fmt.Errorf("error provisioning config Secrets: %w", err)
}
// State secrets are precreated so we can use the ProxyGroup CR as their owner ref.
stateSecrets := pgStateSecrets(pg, r.tsNamespace)
for _, sec := range stateSecrets {
if _, err := createOrUpdate(ctx, r.Client, r.tsNamespace, sec, func(s *corev1.Secret) {
s.ObjectMeta.Labels = sec.ObjectMeta.Labels
s.ObjectMeta.Annotations = sec.ObjectMeta.Annotations
s.ObjectMeta.OwnerReferences = sec.ObjectMeta.OwnerReferences
}); err != nil {
return fmt.Errorf("error provisioning state Secrets: %w", err)
}
}
sa := pgServiceAccount(pg, r.tsNamespace)
if _, err := createOrUpdate(ctx, r.Client, r.tsNamespace, sa, func(s *corev1.ServiceAccount) {
s.ObjectMeta.Labels = sa.ObjectMeta.Labels
s.ObjectMeta.Annotations = sa.ObjectMeta.Annotations
s.ObjectMeta.OwnerReferences = sa.ObjectMeta.OwnerReferences
}); err != nil {
return fmt.Errorf("error provisioning ServiceAccount: %w", err)
}
role := pgRole(pg, r.tsNamespace)
if _, err := createOrUpdate(ctx, r.Client, r.tsNamespace, role, func(r *rbacv1.Role) {
r.ObjectMeta.Labels = role.ObjectMeta.Labels
r.ObjectMeta.Annotations = role.ObjectMeta.Annotations
r.ObjectMeta.OwnerReferences = role.ObjectMeta.OwnerReferences
r.Rules = role.Rules
}); err != nil {
return fmt.Errorf("error provisioning Role: %w", err)
}
roleBinding := pgRoleBinding(pg, r.tsNamespace)
if _, err := createOrUpdate(ctx, r.Client, r.tsNamespace, roleBinding, func(r *rbacv1.RoleBinding) {
r.ObjectMeta.Labels = roleBinding.ObjectMeta.Labels
r.ObjectMeta.Annotations = roleBinding.ObjectMeta.Annotations
r.ObjectMeta.OwnerReferences = roleBinding.ObjectMeta.OwnerReferences
r.RoleRef = roleBinding.RoleRef
r.Subjects = roleBinding.Subjects
}); err != nil {
return fmt.Errorf("error provisioning RoleBinding: %w", err)
}
if pg.Spec.Type == tsapi.ProxyGroupTypeEgress {
cm := pgEgressCM(pg, r.tsNamespace)
if _, err := createOrUpdate(ctx, r.Client, r.tsNamespace, cm, func(existing *corev1.ConfigMap) {
existing.ObjectMeta.Labels = cm.ObjectMeta.Labels
existing.ObjectMeta.OwnerReferences = cm.ObjectMeta.OwnerReferences
}); err != nil {
return fmt.Errorf("error provisioning ConfigMap: %w", err)
}
}
ss, err := pgStatefulSet(pg, r.tsNamespace, r.proxyImage, r.tsFirewallMode, cfgHash)
if err != nil {
return fmt.Errorf("error generating StatefulSet spec: %w", err)
}
ss = applyProxyClassToStatefulSet(proxyClass, ss, nil, logger)
if _, err := createOrUpdate(ctx, r.Client, r.tsNamespace, ss, func(s *appsv1.StatefulSet) {
s.ObjectMeta.Labels = ss.ObjectMeta.Labels
s.ObjectMeta.Annotations = ss.ObjectMeta.Annotations
s.ObjectMeta.OwnerReferences = ss.ObjectMeta.OwnerReferences
s.Spec = ss.Spec
}); err != nil {
return fmt.Errorf("error provisioning StatefulSet: %w", err)
}
if err := r.cleanupDanglingResources(ctx, pg); err != nil {
return fmt.Errorf("error cleaning up dangling resources: %w", err)
}
devices, err := r.getDeviceInfo(ctx, pg)
if err != nil {
return fmt.Errorf("failed to get device info: %w", err)
}
pg.Status.Devices = devices
return nil
}
// cleanupDanglingResources ensures we don't leak config secrets, state secrets, and
// tailnet devices when the number of replicas specified is reduced.
func (r *ProxyGroupReconciler) cleanupDanglingResources(ctx context.Context, pg *tsapi.ProxyGroup) error {
logger := r.logger(pg.Name)
metadata, err := r.getNodeMetadata(ctx, pg)
if err != nil {
return err
}
for _, m := range metadata {
if m.ordinal+1 <= int(pgReplicas(pg)) {
continue
}
// Dangling resource, delete the config + state Secrets, as well as
// deleting the device from the tailnet.
if err := r.deleteTailnetDevice(ctx, m.tsID, logger); err != nil {
return err
}
if err := r.Delete(ctx, m.stateSecret); err != nil {
if !apierrors.IsNotFound(err) {
return fmt.Errorf("error deleting state Secret %s: %w", m.stateSecret.Name, err)
}
}
configSecret := m.stateSecret.DeepCopy()
configSecret.Name += "-config"
if err := r.Delete(ctx, configSecret); err != nil {
if !apierrors.IsNotFound(err) {
return fmt.Errorf("error deleting config Secret %s: %w", configSecret.Name, err)
}
}
}
return nil
}
// maybeCleanup just deletes the device from the tailnet. All the kubernetes
// resources linked to a ProxyGroup will get cleaned up via owner references
// (which we can use because they are all in the same namespace).
func (r *ProxyGroupReconciler) maybeCleanup(ctx context.Context, pg *tsapi.ProxyGroup) (bool, error) {
logger := r.logger(pg.Name)
metadata, err := r.getNodeMetadata(ctx, pg)
if err != nil {
return false, err
}
for _, m := range metadata {
if err := r.deleteTailnetDevice(ctx, m.tsID, logger); err != nil {
return false, err
}
}
logger.Infof("cleaned up ProxyGroup resources")
r.mu.Lock()
r.proxyGroups.Remove(pg.UID)
gaugeProxyGroupResources.Set(int64(r.proxyGroups.Len()))
r.mu.Unlock()
return true, nil
}
func (r *ProxyGroupReconciler) deleteTailnetDevice(ctx context.Context, id tailcfg.StableNodeID, logger *zap.SugaredLogger) error {
logger.Debugf("deleting device %s from control", string(id))
if err := r.tsClient.DeleteDevice(ctx, string(id)); err != nil {
errResp := &tailscale.ErrResponse{}
if ok := errors.As(err, errResp); ok && errResp.Status == http.StatusNotFound {
logger.Debugf("device %s not found, likely because it has already been deleted from control", string(id))
} else {
return fmt.Errorf("error deleting device: %w", err)
}
} else {
logger.Debugf("device %s deleted from control", string(id))
}
return nil
}
func (r *ProxyGroupReconciler) ensureConfigSecretsCreated(ctx context.Context, pg *tsapi.ProxyGroup, proxyClass *tsapi.ProxyClass) (hash string, err error) {
logger := r.logger(pg.Name)
var allConfigs []tailscaledConfigs
for i := range pgReplicas(pg) {
cfgSecret := &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-%d-config", pg.Name, i),
Namespace: r.tsNamespace,
Labels: pgSecretLabels(pg.Name, "config"),
OwnerReferences: pgOwnerReference(pg),
},
}
var existingCfgSecret *corev1.Secret // unmodified copy of secret
if err := r.Get(ctx, client.ObjectKeyFromObject(cfgSecret), cfgSecret); err == nil {
logger.Debugf("secret %s/%s already exists", cfgSecret.GetNamespace(), cfgSecret.GetName())
existingCfgSecret = cfgSecret.DeepCopy()
} else if !apierrors.IsNotFound(err) {
return "", err
}
var authKey string
if existingCfgSecret == nil {
logger.Debugf("creating authkey for new ProxyGroup proxy")
tags := pg.Spec.Tags.Stringify()
if len(tags) == 0 {
tags = r.defaultTags
}
authKey, err = newAuthKey(ctx, r.tsClient, tags)
if err != nil {
return "", err
}
}
configs, err := pgTailscaledConfig(pg, proxyClass, i, authKey, existingCfgSecret)
if err != nil {
return "", fmt.Errorf("error creating tailscaled config: %w", err)
}
allConfigs = append(allConfigs, configs)
for cap, cfg := range configs {
cfgJSON, err := json.Marshal(cfg)
if err != nil {
return "", fmt.Errorf("error marshalling tailscaled config: %w", err)
}
mak.Set(&cfgSecret.StringData, tsoperator.TailscaledConfigFileName(cap), string(cfgJSON))
}
if existingCfgSecret != nil {
logger.Debugf("patching the existing ProxyGroup config Secret %s", cfgSecret.Name)
if err := r.Patch(ctx, cfgSecret, client.MergeFrom(existingCfgSecret)); err != nil {
return "", err
}
} else {
logger.Debugf("creating a new config Secret %s for the ProxyGroup", cfgSecret.Name)
if err := r.Create(ctx, cfgSecret); err != nil {
return "", err
}
}
}
sum := sha256.New()
b, err := json.Marshal(allConfigs)
if err != nil {
return "", err
}
if _, err := sum.Write(b); err != nil {
return "", err
}
return fmt.Sprintf("%x", sum.Sum(nil)), nil
}
func pgTailscaledConfig(pg *tsapi.ProxyGroup, class *tsapi.ProxyClass, idx int32, authKey string, oldSecret *corev1.Secret) (tailscaledConfigs, error) {
conf := &ipn.ConfigVAlpha{
Version: "alpha0",
AcceptDNS: "false",
AcceptRoutes: "false", // AcceptRoutes defaults to true
Locked: "false",
Hostname: ptr.To(fmt.Sprintf("%s-%d", pg.Name, idx)),
}
if pg.Spec.HostnamePrefix != "" {
conf.Hostname = ptr.To(fmt.Sprintf("%s%d", pg.Spec.HostnamePrefix, idx))
}
if shouldAcceptRoutes(class) {
conf.AcceptRoutes = "true"
}
deviceAuthed := false
for _, d := range pg.Status.Devices {
if d.Hostname == *conf.Hostname {
deviceAuthed = true
break
}
}
if authKey != "" {
conf.AuthKey = &authKey
} else if !deviceAuthed {
key, err := authKeyFromSecret(oldSecret)
if err != nil {
return nil, fmt.Errorf("error retrieving auth key from Secret: %w", err)
}
conf.AuthKey = key
}
capVerConfigs := make(map[tailcfg.CapabilityVersion]ipn.ConfigVAlpha)
capVerConfigs[106] = *conf
return capVerConfigs, nil
}
func (r *ProxyGroupReconciler) validate(_ *tsapi.ProxyGroup) error {
return nil
}
// getNodeMetadata gets metadata for all the pods owned by this ProxyGroup by
// querying their state Secrets. It may not return the same number of items as
// specified in the ProxyGroup spec if e.g. it is getting scaled up or down, or
// some pods have failed to write state.
func (r *ProxyGroupReconciler) getNodeMetadata(ctx context.Context, pg *tsapi.ProxyGroup) (metadata []nodeMetadata, _ error) {
// List all state secrets owned by this ProxyGroup.
secrets := &corev1.SecretList{}
if err := r.List(ctx, secrets, client.InNamespace(r.tsNamespace), client.MatchingLabels(pgSecretLabels(pg.Name, "state"))); err != nil {
return nil, fmt.Errorf("failed to list state Secrets: %w", err)
}
for _, secret := range secrets.Items {
var ordinal int
if _, err := fmt.Sscanf(secret.Name, pg.Name+"-%d", &ordinal); err != nil {
return nil, fmt.Errorf("unexpected secret %s was labelled as owned by the ProxyGroup %s: %w", secret.Name, pg.Name, err)
}
id, dnsName, ok, err := getNodeMetadata(ctx, &secret)
if err != nil {
return nil, err
}
if !ok {
continue
}
metadata = append(metadata, nodeMetadata{
ordinal: ordinal,
stateSecret: &secret,
tsID: id,
dnsName: dnsName,
})
}
return metadata, nil
}
func (r *ProxyGroupReconciler) getDeviceInfo(ctx context.Context, pg *tsapi.ProxyGroup) (devices []tsapi.TailnetDevice, _ error) {
metadata, err := r.getNodeMetadata(ctx, pg)
if err != nil {
return nil, err
}
for _, m := range metadata {
device, ok, err := getDeviceInfo(ctx, r.tsClient, m.stateSecret)
if err != nil {
return nil, err
}
if !ok {
continue
}
devices = append(devices, tsapi.TailnetDevice{
Hostname: device.Hostname,
TailnetIPs: device.TailnetIPs,
})
}
return devices, nil
}
type nodeMetadata struct {
ordinal int
stateSecret *corev1.Secret
tsID tailcfg.StableNodeID
dnsName string
}

View File

@@ -0,0 +1,294 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"fmt"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
rbacv1 "k8s.io/api/rbac/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"sigs.k8s.io/yaml"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/kube/egressservices"
"tailscale.com/types/ptr"
)
// Returns the base StatefulSet definition for a ProxyGroup. A ProxyClass may be
// applied over the top after.
func pgStatefulSet(pg *tsapi.ProxyGroup, namespace, image, tsFirewallMode, cfgHash string) (*appsv1.StatefulSet, error) {
ss := new(appsv1.StatefulSet)
if err := yaml.Unmarshal(proxyYaml, &ss); err != nil {
return nil, fmt.Errorf("failed to unmarshal proxy spec: %w", err)
}
// Validate some base assumptions.
if len(ss.Spec.Template.Spec.InitContainers) != 1 {
return nil, fmt.Errorf("[unexpected] base proxy config had %d init containers instead of 1", len(ss.Spec.Template.Spec.InitContainers))
}
if len(ss.Spec.Template.Spec.Containers) != 1 {
return nil, fmt.Errorf("[unexpected] base proxy config had %d containers instead of 1", len(ss.Spec.Template.Spec.Containers))
}
// StatefulSet config.
ss.ObjectMeta = metav1.ObjectMeta{
Name: pg.Name,
Namespace: namespace,
Labels: pgLabels(pg.Name, nil),
OwnerReferences: pgOwnerReference(pg),
}
ss.Spec.Replicas = ptr.To(pgReplicas(pg))
ss.Spec.Selector = &metav1.LabelSelector{
MatchLabels: pgLabels(pg.Name, nil),
}
// Template config.
tmpl := &ss.Spec.Template
tmpl.ObjectMeta = metav1.ObjectMeta{
Name: pg.Name,
Namespace: namespace,
Labels: pgLabels(pg.Name, nil),
DeletionGracePeriodSeconds: ptr.To[int64](10),
Annotations: map[string]string{
podAnnotationLastSetConfigFileHash: cfgHash,
},
}
tmpl.Spec.ServiceAccountName = pg.Name
tmpl.Spec.InitContainers[0].Image = image
tmpl.Spec.Volumes = func() []corev1.Volume {
var volumes []corev1.Volume
for i := range pgReplicas(pg) {
volumes = append(volumes, corev1.Volume{
Name: fmt.Sprintf("tailscaledconfig-%d", i),
VolumeSource: corev1.VolumeSource{
Secret: &corev1.SecretVolumeSource{
SecretName: fmt.Sprintf("%s-%d-config", pg.Name, i),
},
},
})
}
if pg.Spec.Type == tsapi.ProxyGroupTypeEgress {
volumes = append(volumes, corev1.Volume{
Name: pgEgressCMName(pg.Name),
VolumeSource: corev1.VolumeSource{
ConfigMap: &corev1.ConfigMapVolumeSource{
LocalObjectReference: corev1.LocalObjectReference{
Name: pgEgressCMName(pg.Name),
},
},
},
})
}
return volumes
}()
// Main container config.
c := &ss.Spec.Template.Spec.Containers[0]
c.Image = image
c.VolumeMounts = func() []corev1.VolumeMount {
var mounts []corev1.VolumeMount
for i := range pgReplicas(pg) {
mounts = append(mounts, corev1.VolumeMount{
Name: fmt.Sprintf("tailscaledconfig-%d", i),
ReadOnly: true,
MountPath: fmt.Sprintf("/etc/tsconfig/%s-%d", pg.Name, i),
})
}
if pg.Spec.Type == tsapi.ProxyGroupTypeEgress {
mounts = append(mounts, corev1.VolumeMount{
Name: pgEgressCMName(pg.Name),
MountPath: "/etc/proxies",
ReadOnly: true,
})
}
return mounts
}()
c.Env = func() []corev1.EnvVar {
envs := []corev1.EnvVar{
{
// TODO(irbekrm): verify that .status.podIPs are always set, else read in .status.podIP as well.
Name: "POD_IPS", // this will be a comma separate list i.e 10.136.0.6,2600:1900:4011:161:0:e:0:6
ValueFrom: &corev1.EnvVarSource{
FieldRef: &corev1.ObjectFieldSelector{
FieldPath: "status.podIPs",
},
},
},
{
Name: "POD_NAME",
ValueFrom: &corev1.EnvVarSource{
FieldRef: &corev1.ObjectFieldSelector{
// Secret is named after the pod.
FieldPath: "metadata.name",
},
},
},
{
Name: "TS_KUBE_SECRET",
Value: "$(POD_NAME)",
},
{
Name: "TS_STATE",
Value: "kube:$(POD_NAME)",
},
{
Name: "TS_EXPERIMENTAL_VERSIONED_CONFIG_DIR",
Value: "/etc/tsconfig/$(POD_NAME)",
},
{
Name: "TS_USERSPACE",
Value: "false",
},
}
if tsFirewallMode != "" {
envs = append(envs, corev1.EnvVar{
Name: "TS_DEBUG_FIREWALL_MODE",
Value: tsFirewallMode,
})
}
if pg.Spec.Type == tsapi.ProxyGroupTypeEgress {
envs = append(envs, corev1.EnvVar{
Name: "TS_EGRESS_SERVICES_CONFIG_PATH",
Value: fmt.Sprintf("/etc/proxies/%s", egressservices.KeyEgressServices),
})
}
return envs
}()
return ss, nil
}
func pgServiceAccount(pg *tsapi.ProxyGroup, namespace string) *corev1.ServiceAccount {
return &corev1.ServiceAccount{
ObjectMeta: metav1.ObjectMeta{
Name: pg.Name,
Namespace: namespace,
Labels: pgLabels(pg.Name, nil),
OwnerReferences: pgOwnerReference(pg),
},
}
}
func pgRole(pg *tsapi.ProxyGroup, namespace string) *rbacv1.Role {
return &rbacv1.Role{
ObjectMeta: metav1.ObjectMeta{
Name: pg.Name,
Namespace: namespace,
Labels: pgLabels(pg.Name, nil),
OwnerReferences: pgOwnerReference(pg),
},
Rules: []rbacv1.PolicyRule{
{
APIGroups: []string{""},
Resources: []string{"secrets"},
Verbs: []string{
"get",
"patch",
"update",
},
ResourceNames: func() (secrets []string) {
for i := range pgReplicas(pg) {
secrets = append(secrets,
fmt.Sprintf("%s-%d-config", pg.Name, i), // Config with auth key.
fmt.Sprintf("%s-%d", pg.Name, i), // State.
)
}
return secrets
}(),
},
},
}
}
func pgRoleBinding(pg *tsapi.ProxyGroup, namespace string) *rbacv1.RoleBinding {
return &rbacv1.RoleBinding{
ObjectMeta: metav1.ObjectMeta{
Name: pg.Name,
Namespace: namespace,
Labels: pgLabels(pg.Name, nil),
OwnerReferences: pgOwnerReference(pg),
},
Subjects: []rbacv1.Subject{
{
Kind: "ServiceAccount",
Name: pg.Name,
Namespace: namespace,
},
},
RoleRef: rbacv1.RoleRef{
Kind: "Role",
Name: pg.Name,
},
}
}
func pgStateSecrets(pg *tsapi.ProxyGroup, namespace string) (secrets []*corev1.Secret) {
for i := range pgReplicas(pg) {
secrets = append(secrets, &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-%d", pg.Name, i),
Namespace: namespace,
Labels: pgSecretLabels(pg.Name, "state"),
OwnerReferences: pgOwnerReference(pg),
},
})
}
return secrets
}
func pgEgressCM(pg *tsapi.ProxyGroup, namespace string) *corev1.ConfigMap {
return &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: pgEgressCMName(pg.Name),
Namespace: namespace,
Labels: pgLabels(pg.Name, nil),
OwnerReferences: pgOwnerReference(pg),
},
}
}
func pgSecretLabels(pgName, typ string) map[string]string {
return pgLabels(pgName, map[string]string{
labelSecretType: typ, // "config" or "state".
})
}
func pgLabels(pgName string, customLabels map[string]string) map[string]string {
l := make(map[string]string, len(customLabels)+3)
for k, v := range customLabels {
l[k] = v
}
l[LabelManaged] = "true"
l[LabelParentType] = "proxygroup"
l[LabelParentName] = pgName
return l
}
func pgOwnerReference(owner *tsapi.ProxyGroup) []metav1.OwnerReference {
return []metav1.OwnerReference{*metav1.NewControllerRef(owner, tsapi.SchemeGroupVersion.WithKind("ProxyGroup"))}
}
func pgReplicas(pg *tsapi.ProxyGroup) int32 {
if pg.Spec.Replicas != nil {
return *pg.Spec.Replicas
}
return 2
}
func pgEgressCMName(pg string) string {
return fmt.Sprintf("%s-egress-config", pg)
}

View File

@@ -0,0 +1,267 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !plan9
package main
import (
"context"
"encoding/json"
"fmt"
"testing"
"time"
"github.com/google/go-cmp/cmp"
"go.uber.org/zap"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
rbacv1 "k8s.io/api/rbac/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/tools/record"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
"tailscale.com/client/tailscale"
tsoperator "tailscale.com/k8s-operator"
tsapi "tailscale.com/k8s-operator/apis/v1alpha1"
"tailscale.com/tstest"
"tailscale.com/types/ptr"
)
const testProxyImage = "tailscale/tailscale:test"
var defaultProxyClassAnnotations = map[string]string{
"some-annotation": "from-the-proxy-class",
}
func TestProxyGroup(t *testing.T) {
pc := &tsapi.ProxyClass{
ObjectMeta: metav1.ObjectMeta{
Name: "default-pc",
},
Spec: tsapi.ProxyClassSpec{
StatefulSet: &tsapi.StatefulSet{
Annotations: defaultProxyClassAnnotations,
},
},
}
pg := &tsapi.ProxyGroup{
ObjectMeta: metav1.ObjectMeta{
Name: "test",
Finalizers: []string{"tailscale.com/finalizer"},
},
}
fc := fake.NewClientBuilder().
WithScheme(tsapi.GlobalScheme).
WithObjects(pg, pc).
WithStatusSubresource(pg, pc).
Build()
tsClient := &fakeTSClient{}
zl, _ := zap.NewDevelopment()
fr := record.NewFakeRecorder(1)
cl := tstest.NewClock(tstest.ClockOpts{})
reconciler := &ProxyGroupReconciler{
tsNamespace: tsNamespace,
proxyImage: testProxyImage,
defaultTags: []string{"tag:test-tag"},
tsFirewallMode: "auto",
defaultProxyClass: "default-pc",
Client: fc,
tsClient: tsClient,
recorder: fr,
l: zl.Sugar(),
clock: cl,
}
t.Run("proxyclass_not_ready", func(t *testing.T) {
expectReconciled(t, reconciler, "", pg.Name)
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionFalse, reasonProxyGroupCreating, "the ProxyGroup's ProxyClass default-pc is not yet in a ready state, waiting...", 0, cl, zl.Sugar())
expectEqual(t, fc, pg, nil)
})
t.Run("observe_ProxyGroupCreating_status_reason", func(t *testing.T) {
pc.Status = tsapi.ProxyClassStatus{
Conditions: []metav1.Condition{{
Type: string(tsapi.ProxyClassReady),
Status: metav1.ConditionTrue,
Reason: reasonProxyClassValid,
Message: reasonProxyClassValid,
LastTransitionTime: metav1.Time{Time: cl.Now().Truncate(time.Second)},
}},
}
if err := fc.Status().Update(context.Background(), pc); err != nil {
t.Fatal(err)
}
expectReconciled(t, reconciler, "", pg.Name)
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionFalse, reasonProxyGroupCreating, "0/2 ProxyGroup pods running", 0, cl, zl.Sugar())
expectEqual(t, fc, pg, nil)
if expected := 1; reconciler.proxyGroups.Len() != expected {
t.Fatalf("expected %d recorders, got %d", expected, reconciler.proxyGroups.Len())
}
expectProxyGroupResources(t, fc, pg, true)
keyReq := tailscale.KeyCapabilities{
Devices: tailscale.KeyDeviceCapabilities{
Create: tailscale.KeyDeviceCreateCapabilities{
Reusable: false,
Ephemeral: false,
Preauthorized: true,
Tags: []string{"tag:test-tag"},
},
},
}
if diff := cmp.Diff(tsClient.KeyRequests(), []tailscale.KeyCapabilities{keyReq, keyReq}); diff != "" {
t.Fatalf("unexpected secrets (-got +want):\n%s", diff)
}
})
t.Run("simulate_successful_device_auth", func(t *testing.T) {
addNodeIDToStateSecrets(t, fc, pg)
expectReconciled(t, reconciler, "", pg.Name)
pg.Status.Devices = []tsapi.TailnetDevice{
{
Hostname: "hostname-nodeid-0",
TailnetIPs: []string{"1.2.3.4", "::1"},
},
{
Hostname: "hostname-nodeid-1",
TailnetIPs: []string{"1.2.3.4", "::1"},
},
}
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionTrue, reasonProxyGroupReady, reasonProxyGroupReady, 0, cl, zl.Sugar())
expectEqual(t, fc, pg, nil)
expectProxyGroupResources(t, fc, pg, true)
})
t.Run("scale_up_to_3", func(t *testing.T) {
pg.Spec.Replicas = ptr.To[int32](3)
mustUpdate(t, fc, "", pg.Name, func(p *tsapi.ProxyGroup) {
p.Spec = pg.Spec
})
expectReconciled(t, reconciler, "", pg.Name)
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionFalse, reasonProxyGroupCreating, "2/3 ProxyGroup pods running", 0, cl, zl.Sugar())
expectEqual(t, fc, pg, nil)
addNodeIDToStateSecrets(t, fc, pg)
expectReconciled(t, reconciler, "", pg.Name)
tsoperator.SetProxyGroupCondition(pg, tsapi.ProxyGroupReady, metav1.ConditionTrue, reasonProxyGroupReady, reasonProxyGroupReady, 0, cl, zl.Sugar())
pg.Status.Devices = append(pg.Status.Devices, tsapi.TailnetDevice{
Hostname: "hostname-nodeid-2",
TailnetIPs: []string{"1.2.3.4", "::1"},
})
expectEqual(t, fc, pg, nil)
expectProxyGroupResources(t, fc, pg, true)
})
t.Run("scale_down_to_1", func(t *testing.T) {
pg.Spec.Replicas = ptr.To[int32](1)
mustUpdate(t, fc, "", pg.Name, func(p *tsapi.ProxyGroup) {
p.Spec = pg.Spec
})
expectReconciled(t, reconciler, "", pg.Name)
pg.Status.Devices = pg.Status.Devices[:1] // truncate to only the first device.
expectEqual(t, fc, pg, nil)
expectProxyGroupResources(t, fc, pg, true)
})
t.Run("delete_and_cleanup", func(t *testing.T) {
if err := fc.Delete(context.Background(), pg); err != nil {
t.Fatal(err)
}
expectReconciled(t, reconciler, "", pg.Name)
expectMissing[tsapi.Recorder](t, fc, "", pg.Name)
if expected := 0; reconciler.proxyGroups.Len() != expected {
t.Fatalf("expected %d ProxyGroups, got %d", expected, reconciler.proxyGroups.Len())
}
// 2 nodes should get deleted as part of the scale down, and then finally
// the first node gets deleted with the ProxyGroup cleanup.
if diff := cmp.Diff(tsClient.deleted, []string{"nodeid-1", "nodeid-2", "nodeid-0"}); diff != "" {
t.Fatalf("unexpected deleted devices (-got +want):\n%s", diff)
}
// The fake client does not clean up objects whose owner has been
// deleted, so we can't test for the owned resources getting deleted.
})
}
func expectProxyGroupResources(t *testing.T, fc client.WithWatch, pg *tsapi.ProxyGroup, shouldExist bool) {
t.Helper()
role := pgRole(pg, tsNamespace)
roleBinding := pgRoleBinding(pg, tsNamespace)
serviceAccount := pgServiceAccount(pg, tsNamespace)
statefulSet, err := pgStatefulSet(pg, tsNamespace, testProxyImage, "auto", "")
if err != nil {
t.Fatal(err)
}
statefulSet.Annotations = defaultProxyClassAnnotations
if shouldExist {
expectEqual(t, fc, role, nil)
expectEqual(t, fc, roleBinding, nil)
expectEqual(t, fc, serviceAccount, nil)
expectEqual(t, fc, statefulSet, func(ss *appsv1.StatefulSet) {
ss.Spec.Template.Annotations[podAnnotationLastSetConfigFileHash] = ""
})
} else {
expectMissing[rbacv1.Role](t, fc, role.Namespace, role.Name)
expectMissing[rbacv1.RoleBinding](t, fc, roleBinding.Namespace, roleBinding.Name)
expectMissing[corev1.ServiceAccount](t, fc, serviceAccount.Namespace, serviceAccount.Name)
expectMissing[appsv1.StatefulSet](t, fc, statefulSet.Namespace, statefulSet.Name)
}
var expectedSecrets []string
for i := range pgReplicas(pg) {
expectedSecrets = append(expectedSecrets,
fmt.Sprintf("%s-%d", pg.Name, i),
fmt.Sprintf("%s-%d-config", pg.Name, i),
)
}
expectSecrets(t, fc, expectedSecrets)
}
func expectSecrets(t *testing.T, fc client.WithWatch, expected []string) {
t.Helper()
secrets := &corev1.SecretList{}
if err := fc.List(context.Background(), secrets); err != nil {
t.Fatal(err)
}
var actual []string
for _, secret := range secrets.Items {
actual = append(actual, secret.Name)
}
if diff := cmp.Diff(actual, expected); diff != "" {
t.Fatalf("unexpected secrets (-got +want):\n%s", diff)
}
}
func addNodeIDToStateSecrets(t *testing.T, fc client.WithWatch, pg *tsapi.ProxyGroup) {
const key = "profile-abc"
for i := range pgReplicas(pg) {
bytes, err := json.Marshal(map[string]any{
"Config": map[string]any{
"NodeID": fmt.Sprintf("nodeid-%d", i),
},
})
if err != nil {
t.Fatal(err)
}
mustUpdate(t, fc, tsNamespace, fmt.Sprintf("test-%d", i), func(s *corev1.Secret) {
s.Data = map[string][]byte{
currentProfileKey: []byte(key),
key: bytes,
}
})
}
}

View File

@@ -47,11 +47,11 @@ const (
LabelParentType = "tailscale.com/parent-resource-type"
LabelParentName = "tailscale.com/parent-resource"
LabelParentNamespace = "tailscale.com/parent-resource-ns"
labelSecretType = "tailscale.com/secret-type" // "config" or "state".
// LabelProxyClass can be set by users on Connectors, tailscale
// Ingresses and Services that define cluster ingress or cluster egress,
// to specify that configuration in this ProxyClass should be applied to
// resources created for the Connector, Ingress or Service.
// LabelProxyClass can be set by users on tailscale Ingresses and Services that define cluster ingress or
// cluster egress, to specify that configuration in this ProxyClass should be applied to resources created for
// the Ingress or Service.
LabelProxyClass = "tailscale.com/proxy-class"
FinalizerName = "tailscale.com/finalizer"
@@ -65,6 +65,8 @@ const (
//MagicDNS name of tailnet node.
AnnotationTailnetTargetFQDN = "tailscale.com/tailnet-fqdn"
AnnotationProxyGroup = "tailscale.com/proxy-group"
// Annotations settable by users on ingresses.
AnnotationFunnel = "tailscale.com/funnel"
@@ -302,7 +304,7 @@ func (a *tailscaleSTSReconciler) reconcileHeadlessService(ctx context.Context, l
return createOrUpdate(ctx, a.Client, a.operatorNamespace, hsvc, func(svc *corev1.Service) { svc.Spec = hsvc.Spec })
}
func (a *tailscaleSTSReconciler) createOrGetSecret(ctx context.Context, logger *zap.SugaredLogger, stsC *tailscaleSTSConfig, hsvc *corev1.Service) (secretName, hash string, configs tailscaleConfigs, _ error) {
func (a *tailscaleSTSReconciler) createOrGetSecret(ctx context.Context, logger *zap.SugaredLogger, stsC *tailscaleSTSConfig, hsvc *corev1.Service) (secretName, hash string, configs tailscaledConfigs, _ error) {
secret := &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
// Hardcode a -0 suffix so that in future, if we support
@@ -360,7 +362,7 @@ func (a *tailscaleSTSReconciler) createOrGetSecret(ctx context.Context, logger *
latest := tailcfg.CapabilityVersion(-1)
var latestConfig ipn.ConfigVAlpha
for key, val := range configs {
fn := tsoperator.TailscaledConfigFileNameForCap(key)
fn := tsoperator.TailscaledConfigFileName(key)
b, err := json.Marshal(val)
if err != nil {
return "", "", nil, fmt.Errorf("error marshalling tailscaled config: %w", err)
@@ -670,7 +672,7 @@ func applyProxyClassToStatefulSet(pc *tsapi.ProxyClass, ss *appsv1.StatefulSet,
if pc == nil || ss == nil {
return ss
}
if pc.Spec.Metrics != nil && pc.Spec.Metrics.Enable {
if stsCfg != nil && pc.Spec.Metrics != nil && pc.Spec.Metrics.Enable {
if stsCfg.TailnetTargetFQDN == "" && stsCfg.TailnetTargetIP == "" && !stsCfg.ForwardClusterTrafficViaL7IngressProxy {
enableMetrics(ss, pc)
} else if stsCfg.ForwardClusterTrafficViaL7IngressProxy {
@@ -792,7 +794,7 @@ func readAuthKey(secret *corev1.Secret, key string) (*string, error) {
// TODO (irbekrm): remove the legacy config once we no longer need to support
// versions older than cap94,
// https://tailscale.com/kb/1236/kubernetes-operator#operator-and-proxies
func tailscaledConfig(stsC *tailscaleSTSConfig, newAuthkey string, oldSecret *corev1.Secret) (tailscaleConfigs, error) {
func tailscaledConfig(stsC *tailscaleSTSConfig, newAuthkey string, oldSecret *corev1.Secret) (tailscaledConfigs, error) {
conf := &ipn.ConfigVAlpha{
Version: "alpha0",
AcceptDNS: "false",
@@ -821,33 +823,12 @@ func tailscaledConfig(stsC *tailscaleSTSConfig, newAuthkey string, oldSecret *co
if newAuthkey != "" {
conf.AuthKey = &newAuthkey
} else if oldSecret != nil {
var err error
latest := tailcfg.CapabilityVersion(-1)
latestStr := ""
for k, data := range oldSecret.Data {
// write to StringData, read from Data as StringData is write-only
if len(data) == 0 {
continue
}
v, err := tsoperator.CapVerFromFileName(k)
if err != nil {
continue
}
if v > latest {
latestStr = k
latest = v
}
}
// Allow for configs that don't contain an auth key. Perhaps
// users have some mechanisms to delete them. Auth key is
// normally not needed after the initial login.
if latestStr != "" {
conf.AuthKey, err = readAuthKey(oldSecret, latestStr)
if err != nil {
return nil, err
}
} else if shouldRetainAuthKey(oldSecret) {
key, err := authKeyFromSecret(oldSecret)
if err != nil {
return nil, fmt.Errorf("error retrieving auth key from Secret: %w", err)
}
conf.AuthKey = key
}
capVerConfigs := make(map[tailcfg.CapabilityVersion]ipn.ConfigVAlpha)
capVerConfigs[95] = *conf
@@ -857,6 +838,41 @@ func tailscaledConfig(stsC *tailscaleSTSConfig, newAuthkey string, oldSecret *co
return capVerConfigs, nil
}
func authKeyFromSecret(s *corev1.Secret) (key *string, err error) {
latest := tailcfg.CapabilityVersion(-1)
latestStr := ""
for k, data := range s.Data {
// write to StringData, read from Data as StringData is write-only
if len(data) == 0 {
continue
}
v, err := tsoperator.CapVerFromFileName(k)
if err != nil {
continue
}
if v > latest {
latestStr = k
latest = v
}
}
// Allow for configs that don't contain an auth key. Perhaps
// users have some mechanisms to delete them. Auth key is
// normally not needed after the initial login.
if latestStr != "" {
return readAuthKey(s, latestStr)
}
return key, nil
}
// shouldRetainAuthKey returns true if the state stored in a proxy's state Secret suggests that auth key should be
// retained (because the proxy has not yet successfully authenticated).
func shouldRetainAuthKey(s *corev1.Secret) bool {
if s == nil {
return false // nothing to retain here
}
return len(s.Data["device_id"]) == 0 // proxy has not authed yet
}
func shouldAcceptRoutes(pc *tsapi.ProxyClass) bool {
return pc != nil && pc.Spec.TailscaleConfig != nil && pc.Spec.TailscaleConfig.AcceptRoutes
}
@@ -868,7 +884,7 @@ type ptrObject[T any] interface {
*T
}
type tailscaleConfigs map[tailcfg.CapabilityVersion]ipn.ConfigVAlpha
type tailscaledConfigs map[tailcfg.CapabilityVersion]ipn.ConfigVAlpha
// hashBytes produces a hash for the provided tailscaled config that is the same across
// different invocations of this code. We do not use the
@@ -879,7 +895,7 @@ type tailscaleConfigs map[tailcfg.CapabilityVersion]ipn.ConfigVAlpha
// thing that changed is operator version (the hash is also exposed to users via
// an annotation and might be confusing if it changes without the config having
// changed).
func tailscaledConfigHash(c tailscaleConfigs) (string, error) {
func tailscaledConfigHash(c tailscaledConfigs) (string, error) {
b, err := json.Marshal(c)
if err != nil {
return "", fmt.Errorf("error marshalling tailscaled configs: %w", err)

View File

@@ -64,7 +64,7 @@ type ServiceReconciler struct {
clock tstime.Clock
proxyDefaultClass string
defaultProxyClass string
}
var (
@@ -112,6 +112,10 @@ func (a *ServiceReconciler) Reconcile(ctx context.Context, req reconcile.Request
return reconcile.Result{}, fmt.Errorf("failed to get svc: %w", err)
}
if _, ok := svc.Annotations[AnnotationProxyGroup]; ok {
return reconcile.Result{}, nil // this reconciler should not look at Services for ProxyGroup
}
if !svc.DeletionTimestamp.IsZero() || !a.isTailscaleService(svc) {
logger.Debugf("service is being deleted or is (no longer) referring to Tailscale ingress/egress, ensuring any created resources are cleaned up")
return reconcile.Result{}, a.maybeCleanup(ctx, logger, svc)
@@ -211,7 +215,7 @@ func (a *ServiceReconciler) maybeProvision(ctx context.Context, logger *zap.Suga
return nil
}
proxyClass := proxyClassForObject(svc, a.proxyDefaultClass)
proxyClass := proxyClassForObject(svc, a.defaultProxyClass)
if proxyClass != "" {
if ready, err := proxyClassIsReady(ctx, proxyClass, a.Client); err != nil {
errMsg := fmt.Errorf("error verifying ProxyClass for Service: %w", err)
@@ -354,6 +358,10 @@ func validateService(svc *corev1.Service) []string {
violations = append(violations, fmt.Sprintf("invalid value of annotation %s: %q does not appear to be a valid MagicDNS name", AnnotationTailnetTargetFQDN, fqdn))
}
}
// TODO(irbekrm): validate that tailscale.com/tailnet-ip annotation is a
// valid IP address (tailscale/tailscale#13671).
svcName := nameForService(svc)
if err := dnsname.ValidLabel(svcName); err != nil {
if _, ok := svc.Annotations[AnnotationHostname]; ok {

View File

@@ -53,6 +53,8 @@ type configOpts struct {
shouldEnableForwardingClusterTrafficViaIngress bool
proxyClass string // configuration from the named ProxyClass should be applied to proxy resources
app string
shouldRemoveAuthKey bool
secretExtraData map[string][]byte
}
func expectedSTS(t *testing.T, cl client.Client, opts configOpts) *appsv1.StatefulSet {
@@ -365,6 +367,9 @@ func expectedSecret(t *testing.T, cl client.Client, opts configOpts) *corev1.Sec
conf.AcceptRoutes = "true"
}
}
if opts.shouldRemoveAuthKey {
conf.AuthKey = nil
}
var routes []netip.Prefix
if opts.subnetRoutes != "" || opts.isExitNode {
r := opts.subnetRoutes
@@ -405,6 +410,9 @@ func expectedSecret(t *testing.T, cl client.Client, opts configOpts) *corev1.Sec
labels["tailscale.com/parent-resource-ns"] = "" // Connector is cluster scoped
}
s.Labels = labels
for key, val := range opts.secretExtraData {
mak.Set(&s.Data, key, val)
}
return s
}
@@ -596,7 +604,7 @@ func (c *fakeTSClient) CreateKey(ctx context.Context, caps tailscale.KeyCapabili
func (c *fakeTSClient) Device(ctx context.Context, deviceID string, fields *tailscale.DeviceFieldsOpts) (*tailscale.Device, error) {
return &tailscale.Device{
DeviceID: deviceID,
Hostname: "test-device",
Hostname: "hostname-" + deviceID,
Addresses: []string{
"1.2.3.4",
"::1",
@@ -631,6 +639,14 @@ func removeHashAnnotation(sts *appsv1.StatefulSet) {
delete(sts.Spec.Template.Annotations, podAnnotationLastSetConfigFileHash)
}
func removeTargetPortsFromSvc(svc *corev1.Service) {
newPorts := make([]corev1.ServicePort, 0)
for _, p := range svc.Spec.Ports {
newPorts = append(newPorts, corev1.ServicePort{Protocol: p.Protocol, Port: p.Port})
}
svc.Spec.Ports = newPorts
}
func removeAuthKeyIfExistsModifier(t *testing.T) func(s *corev1.Secret) {
return func(secret *corev1.Secret) {
t.Helper()

View File

@@ -199,7 +199,7 @@ func (r *RecorderReconciler) maybeProvision(ctx context.Context, tsr *tsapi.Reco
return fmt.Errorf("error creating StatefulSet: %w", err)
}
var devices []tsapi.TailnetDevice
var devices []tsapi.RecorderTailnetDevice
device, ok, err := r.getDeviceInfo(ctx, tsr.Name)
if err != nil {
@@ -302,9 +302,7 @@ func (r *RecorderReconciler) validate(tsr *tsapi.Recorder) error {
return nil
}
// getNodeMetadata returns 'ok == true' iff the node ID is found. The dnsName
// is expected to always be non-empty if the node ID is, but not required.
func (r *RecorderReconciler) getNodeMetadata(ctx context.Context, tsrName string) (id tailcfg.StableNodeID, dnsName string, ok bool, err error) {
func (r *RecorderReconciler) getStateSecret(ctx context.Context, tsrName string) (*corev1.Secret, error) {
secret := &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Namespace: r.tsNamespace,
@@ -313,12 +311,27 @@ func (r *RecorderReconciler) getNodeMetadata(ctx context.Context, tsrName string
}
if err := r.Get(ctx, client.ObjectKeyFromObject(secret), secret); err != nil {
if apierrors.IsNotFound(err) {
return "", "", false, nil
return nil, nil
}
return nil, fmt.Errorf("error getting state Secret: %w", err)
}
return secret, nil
}
func (r *RecorderReconciler) getNodeMetadata(ctx context.Context, tsrName string) (id tailcfg.StableNodeID, dnsName string, ok bool, err error) {
secret, err := r.getStateSecret(ctx, tsrName)
if err != nil || secret == nil {
return "", "", false, err
}
return getNodeMetadata(ctx, secret)
}
// getNodeMetadata returns 'ok == true' iff the node ID is found. The dnsName
// is expected to always be non-empty if the node ID is, but not required.
func getNodeMetadata(ctx context.Context, secret *corev1.Secret) (id tailcfg.StableNodeID, dnsName string, ok bool, err error) {
// TODO(tomhjp): Should maybe use ipn to parse the following info instead.
currentProfile, ok := secret.Data[currentProfileKey]
if !ok {
@@ -337,20 +350,29 @@ func (r *RecorderReconciler) getNodeMetadata(ctx context.Context, tsrName string
return tailcfg.StableNodeID(profile.Config.NodeID), profile.Config.UserProfile.LoginName, ok, nil
}
func (r *RecorderReconciler) getDeviceInfo(ctx context.Context, tsrName string) (d tsapi.TailnetDevice, ok bool, err error) {
nodeID, dnsName, ok, err := r.getNodeMetadata(ctx, tsrName)
func (r *RecorderReconciler) getDeviceInfo(ctx context.Context, tsrName string) (d tsapi.RecorderTailnetDevice, ok bool, err error) {
secret, err := r.getStateSecret(ctx, tsrName)
if err != nil || secret == nil {
return tsapi.RecorderTailnetDevice{}, false, err
}
return getDeviceInfo(ctx, r.tsClient, secret)
}
func getDeviceInfo(ctx context.Context, tsClient tsClient, secret *corev1.Secret) (d tsapi.RecorderTailnetDevice, ok bool, err error) {
nodeID, dnsName, ok, err := getNodeMetadata(ctx, secret)
if !ok || err != nil {
return tsapi.TailnetDevice{}, false, err
return tsapi.RecorderTailnetDevice{}, false, err
}
// TODO(tomhjp): The profile info doesn't include addresses, which is why we
// need the API. Should we instead update the profile to include addresses?
device, err := r.tsClient.Device(ctx, string(nodeID), nil)
device, err := tsClient.Device(ctx, string(nodeID), nil)
if err != nil {
return tsapi.TailnetDevice{}, false, fmt.Errorf("failed to get device info from API: %w", err)
return tsapi.RecorderTailnetDevice{}, false, fmt.Errorf("failed to get device info from API: %w", err)
}
d = tsapi.TailnetDevice{
d = tsapi.RecorderTailnetDevice{
Hostname: device.Hostname,
TailnetIPs: device.Addresses,
}
@@ -370,6 +392,6 @@ type profile struct {
} `json:"Config"`
}
func markedForDeletion(tsr *tsapi.Recorder) bool {
return !tsr.DeletionTimestamp.IsZero()
func markedForDeletion(obj metav1.Object) bool {
return !obj.GetDeletionTimestamp().IsZero()
}

View File

@@ -105,9 +105,9 @@ func TestRecorder(t *testing.T) {
})
expectReconciled(t, reconciler, "", tsr.Name)
tsr.Status.Devices = []tsapi.TailnetDevice{
tsr.Status.Devices = []tsapi.RecorderTailnetDevice{
{
Hostname: "test-device",
Hostname: "hostname-nodeid-123",
TailnetIPs: []string{"1.2.3.4", "::1"},
URL: "https://test-0.example.ts.net",
},

View File

@@ -91,7 +91,7 @@ tailscale.com/cmd/stund dependencies: (generated by github.com/tailscale/depawar
golang.org/x/crypto/nacl/secretbox from golang.org/x/crypto/nacl/box
golang.org/x/crypto/salsa20/salsa from golang.org/x/crypto/nacl/box+
golang.org/x/crypto/sha3 from crypto/internal/mlkem768+
golang.org/x/net/dns/dnsmessage from net
golang.org/x/net/dns/dnsmessage from net+
golang.org/x/net/http/httpguts from net/http
golang.org/x/net/http/httpproxy from net/http
golang.org/x/net/http2/hpack from net/http

View File

@@ -0,0 +1,78 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
package cli
import (
"context"
"flag"
"fmt"
"strings"
"github.com/peterbourgon/ff/v3/ffcli"
"tailscale.com/envknob"
"tailscale.com/ipn"
"tailscale.com/tailcfg"
)
var advertiseArgs struct {
services string // comma-separated list of services to advertise
}
// TODO(naman): This flag may move to set.go or serve_v2.go after the WIPCode
// envknob is not needed.
var advertiseCmd = &ffcli.Command{
Name: "advertise",
ShortUsage: "tailscale advertise --services=<services>",
ShortHelp: "Advertise this node as a destination for a service",
Exec: runAdvertise,
FlagSet: (func() *flag.FlagSet {
fs := newFlagSet("advertise")
fs.StringVar(&advertiseArgs.services, "services", "", "comma-separated services to advertise; each must start with \"svc:\" (e.g. \"svc:idp,svc:nas,svc:database\")")
return fs
})(),
}
func maybeAdvertiseCmd() []*ffcli.Command {
if !envknob.UseWIPCode() {
return nil
}
return []*ffcli.Command{advertiseCmd}
}
func runAdvertise(ctx context.Context, args []string) error {
if len(args) > 0 {
return flag.ErrHelp
}
services, err := parseServiceNames(advertiseArgs.services)
if err != nil {
return err
}
_, err = localClient.EditPrefs(ctx, &ipn.MaskedPrefs{
AdvertiseServicesSet: true,
Prefs: ipn.Prefs{
AdvertiseServices: services,
},
})
return err
}
// parseServiceNames takes a comma-separated list of service names
// (eg. "svc:hello,svc:webserver,svc:catphotos"), splits them into
// a list and validates each service name. If valid, it returns
// the service names in a slice of strings.
func parseServiceNames(servicesArg string) ([]string, error) {
var services []string
if servicesArg != "" {
services = strings.Split(servicesArg, ",")
for _, svc := range services {
err := tailcfg.CheckServiceName(svc)
if err != nil {
return nil, fmt.Errorf("service %q: %s", svc, err)
}
}
}
return services, nil
}

View File

@@ -177,7 +177,7 @@ For help on subcommands, add --help after: "tailscale status --help".
This CLI is still under active development. Commands and flags will
change in the future.
`),
Subcommands: []*ffcli.Command{
Subcommands: append([]*ffcli.Command{
upCmd,
downCmd,
setCmd,
@@ -207,7 +207,7 @@ change in the future.
debugCmd,
driveCmd,
idTokenCmd,
},
}, maybeAdvertiseCmd()...),
FlagSet: rootfs,
Exec: func(ctx context.Context, args []string) error {
if len(args) > 0 {

View File

@@ -946,6 +946,10 @@ func TestPrefFlagMapping(t *testing.T) {
// Handled by the tailscale share subcommand, we don't want a CLI
// flag for this.
continue
case "AdvertiseServices":
// Handled by the tailscale advertise subcommand, we don't want a
// CLI flag for this.
continue
case "InternalExitNodePrior":
// Used internally by LocalBackend as part of exit node usage toggling.
// No CLI flag for this.
@@ -1448,7 +1452,7 @@ func TestParseNLArgs(t *testing.T) {
name: "disablements not allowed",
input: []string{"disablement:" + strings.Repeat("02", 32)},
parseKeys: true,
wantErr: fmt.Errorf("parsing key 1: key hex string doesn't have expected type prefix nlpub:"),
wantErr: fmt.Errorf("parsing key 1: key hex string doesn't have expected type prefix tlpub:"),
},
{
name: "keys not allowed",

View File

@@ -844,7 +844,8 @@ func runTS2021(ctx context.Context, args []string) error {
if ts2021Args.verbose {
logf = log.Printf
}
conn, err := (&controlhttp.Dialer{
noiseDialer := &controlhttp.Dialer{
Hostname: ts2021Args.host,
HTTPPort: "80",
HTTPSPort: "443",
@@ -853,7 +854,21 @@ func runTS2021(ctx context.Context, args []string) error {
ProtocolVersion: uint16(ts2021Args.version),
Dialer: dialFunc,
Logf: logf,
}).Dial(ctx)
}
const tries = 2
for i := range tries {
err := tryConnect(ctx, keys.PublicKey, noiseDialer)
if err != nil {
log.Printf("error on attempt %d/%d: %v", i+1, tries, err)
continue
}
break
}
return nil
}
func tryConnect(ctx context.Context, controlPublic key.MachinePublic, noiseDialer *controlhttp.Dialer) error {
conn, err := noiseDialer.Dial(ctx)
log.Printf("controlhttp.Dial = %p, %v", conn, err)
if err != nil {
return err
@@ -861,8 +876,8 @@ func runTS2021(ctx context.Context, args []string) error {
log.Printf("did noise handshake")
gotPeer := conn.Peer()
if gotPeer != keys.PublicKey {
log.Printf("peer = %v, want %v", gotPeer, keys.PublicKey)
if gotPeer != controlPublic {
log.Printf("peer = %v, want %v", gotPeer, controlPublic)
return errors.New("key mismatch")
}
@@ -894,7 +909,7 @@ func runTS2021(ctx context.Context, args []string) error {
// Make a /whoami request to the server to verify that we can actually
// communicate over the newly-established connection.
whoamiURL := "http://" + ts2021Args.host + "/machine/whoami"
req, err = http.NewRequestWithContext(ctx, "GET", whoamiURL, nil)
req, err := http.NewRequestWithContext(ctx, "GET", whoamiURL, nil)
if err != nil {
return err
}

View File

@@ -0,0 +1,163 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
package cli
import (
"context"
"flag"
"fmt"
"net/netip"
"os"
"text/tabwriter"
"golang.org/x/net/dns/dnsmessage"
"tailscale.com/types/dnstype"
)
func runDNSQuery(ctx context.Context, args []string) error {
if len(args) < 1 {
return flag.ErrHelp
}
name := args[0]
queryType := "A"
if len(args) >= 2 {
queryType = args[1]
}
fmt.Printf("DNS query for %q (%s) using internal resolver:\n", name, queryType)
fmt.Println()
bytes, resolvers, err := localClient.QueryDNS(ctx, name, queryType)
if err != nil {
fmt.Printf("failed to query DNS: %v\n", err)
return nil
}
if len(resolvers) == 1 {
fmt.Printf("Forwarding to resolver: %v\n", makeResolverString(*resolvers[0]))
} else {
fmt.Println("Multiple resolvers available:")
for _, r := range resolvers {
fmt.Printf(" - %v\n", makeResolverString(*r))
}
}
fmt.Println()
var p dnsmessage.Parser
header, err := p.Start(bytes)
if err != nil {
fmt.Printf("failed to parse DNS response: %v\n", err)
return err
}
fmt.Printf("Response code: %v\n", header.RCode.String())
fmt.Println()
p.SkipAllQuestions()
if header.RCode != dnsmessage.RCodeSuccess {
fmt.Println("No answers were returned.")
return nil
}
answers, err := p.AllAnswers()
if err != nil {
fmt.Printf("failed to parse DNS answers: %v\n", err)
return err
}
if len(answers) == 0 {
fmt.Println(" (no answers found)")
}
w := tabwriter.NewWriter(os.Stdout, 0, 0, 2, ' ', 0)
fmt.Fprintln(w, "Name\tTTL\tClass\tType\tBody")
fmt.Fprintln(w, "----\t---\t-----\t----\t----")
for _, a := range answers {
fmt.Fprintf(w, "%s\t%d\t%s\t%s\t%s\n", a.Header.Name.String(), a.Header.TTL, a.Header.Class.String(), a.Header.Type.String(), makeAnswerBody(a))
}
w.Flush()
fmt.Println()
return nil
}
// makeAnswerBody returns a string with the DNS answer body in a human-readable format.
func makeAnswerBody(a dnsmessage.Resource) string {
switch a.Header.Type {
case dnsmessage.TypeA:
return makeABody(a.Body)
case dnsmessage.TypeAAAA:
return makeAAAABody(a.Body)
case dnsmessage.TypeCNAME:
return makeCNAMEBody(a.Body)
case dnsmessage.TypeMX:
return makeMXBody(a.Body)
case dnsmessage.TypeNS:
return makeNSBody(a.Body)
case dnsmessage.TypeOPT:
return makeOPTBody(a.Body)
case dnsmessage.TypePTR:
return makePTRBody(a.Body)
case dnsmessage.TypeSRV:
return makeSRVBody(a.Body)
case dnsmessage.TypeTXT:
return makeTXTBody(a.Body)
default:
return a.Body.GoString()
}
}
func makeABody(a dnsmessage.ResourceBody) string {
if a, ok := a.(*dnsmessage.AResource); ok {
return netip.AddrFrom4(a.A).String()
}
return ""
}
func makeAAAABody(aaaa dnsmessage.ResourceBody) string {
if a, ok := aaaa.(*dnsmessage.AAAAResource); ok {
return netip.AddrFrom16(a.AAAA).String()
}
return ""
}
func makeCNAMEBody(cname dnsmessage.ResourceBody) string {
if c, ok := cname.(*dnsmessage.CNAMEResource); ok {
return c.CNAME.String()
}
return ""
}
func makeMXBody(mx dnsmessage.ResourceBody) string {
if m, ok := mx.(*dnsmessage.MXResource); ok {
return fmt.Sprintf("%s (Priority=%d)", m.MX, m.Pref)
}
return ""
}
func makeNSBody(ns dnsmessage.ResourceBody) string {
if n, ok := ns.(*dnsmessage.NSResource); ok {
return n.NS.String()
}
return ""
}
func makeOPTBody(opt dnsmessage.ResourceBody) string {
if o, ok := opt.(*dnsmessage.OPTResource); ok {
return o.GoString()
}
return ""
}
func makePTRBody(ptr dnsmessage.ResourceBody) string {
if p, ok := ptr.(*dnsmessage.PTRResource); ok {
return p.PTR.String()
}
return ""
}
func makeSRVBody(srv dnsmessage.ResourceBody) string {
if s, ok := srv.(*dnsmessage.SRVResource); ok {
return fmt.Sprintf("Target=%s, Port=%d, Priority=%d, Weight=%d", s.Target.String(), s.Port, s.Priority, s.Weight)
}
return ""
}
func makeTXTBody(txt dnsmessage.ResourceBody) string {
if t, ok := txt.(*dnsmessage.TXTResource); ok {
return fmt.Sprintf("%q", t.TXT)
}
return ""
}
func makeResolverString(r dnstype.Resolver) string {
if len(r.BootstrapResolution) > 0 {
return fmt.Sprintf("%s (bootstrap: %v)", r.Addr, r.BootstrapResolution)
}
return fmt.Sprintf("%s", r.Addr)
}

View File

@@ -75,7 +75,7 @@ func runDNSStatus(ctx context.Context, args []string) error {
fmt.Print("\n")
fmt.Println("Split DNS Routes:")
if len(dnsConfig.Routes) == 0 {
fmt.Println(" (no routes configured: split DNS might not be in use)")
fmt.Println(" (no routes configured: split DNS disabled)")
}
for _, k := range slices.Sorted(maps.Keys(dnsConfig.Routes)) {
v := dnsConfig.Routes[k]

View File

@@ -28,8 +28,13 @@ var dnsCmd = &ffcli.Command{
return fs
})(),
},
// TODO: implement `tailscale query` here
{
Name: "query",
ShortUsage: "tailscale dns query <name> [a|aaaa|cname|mx|ns|opt|ptr|srv|txt]",
Exec: runDNSQuery,
ShortHelp: "Perform a DNS query",
LongHelp: "The 'tailscale dns query' subcommand performs a DNS query for the specified name using the internal DNS forwarder (100.100.100.100).\n\nIt also provides information about the resolver(s) used to resolve the query.",
},
// TODO: implement `tailscale log` here

View File

@@ -136,6 +136,7 @@ func printReport(dm *tailcfg.DERPMap, report *netcheck.Report) error {
}
printf("\nReport:\n")
printf("\t* Time: %v\n", report.Now.Format(time.RFC3339Nano))
printf("\t* UDP: %v\n", report.UDP)
if report.GlobalV4.IsValid() {
printf("\t* IPv4: yes, %s\n", report.GlobalV4)

View File

@@ -151,13 +151,15 @@ func runNetworkLockInit(ctx context.Context, args []string) error {
return nil
}
fmt.Printf("%d disablement secrets have been generated and are printed below. Take note of them now, they WILL NOT be shown again.\n", nlInitArgs.numDisablements)
var successMsg strings.Builder
fmt.Fprintf(&successMsg, "%d disablement secrets have been generated and are printed below. Take note of them now, they WILL NOT be shown again.\n", nlInitArgs.numDisablements)
for range nlInitArgs.numDisablements {
var secret [32]byte
if _, err := rand.Read(secret[:]); err != nil {
return err
}
fmt.Printf("\tdisablement-secret:%X\n", secret[:])
fmt.Fprintf(&successMsg, "\tdisablement-secret:%X\n", secret[:])
disablementValues = append(disablementValues, tka.DisablementKDF(secret[:]))
}
@@ -168,7 +170,7 @@ func runNetworkLockInit(ctx context.Context, args []string) error {
return err
}
disablementValues = append(disablementValues, tka.DisablementKDF(supportDisablement))
fmt.Println("A disablement secret for Tailscale support has been generated and will be transmitted to Tailscale upon initialization.")
fmt.Fprintln(&successMsg, "A disablement secret for Tailscale support has been generated and transmitted to Tailscale.")
}
// The state returned by NetworkLockInit likely doesn't contain the initialized state,
@@ -177,6 +179,7 @@ func runNetworkLockInit(ctx context.Context, args []string) error {
return err
}
fmt.Print(successMsg.String())
fmt.Println("Initialization complete.")
return nil
}

View File

@@ -32,10 +32,12 @@ import (
"tailscale.com/ipn"
"tailscale.com/ipn/ipnstate"
"tailscale.com/net/netutil"
"tailscale.com/net/tsaddr"
"tailscale.com/safesocket"
"tailscale.com/tailcfg"
"tailscale.com/types/logger"
"tailscale.com/types/preftype"
"tailscale.com/types/views"
"tailscale.com/util/dnsname"
"tailscale.com/version"
"tailscale.com/version/distro"
@@ -162,6 +164,9 @@ func defaultNetfilterMode() string {
return "on"
}
// upArgsT is the type of upArgs, the argument struct for `tailscale up`.
// As of 2024-10-08, upArgsT is frozen and no new arguments should be
// added to it. Add new arguments to setArgsT instead.
type upArgsT struct {
qr bool
reset bool
@@ -1015,7 +1020,7 @@ func prefsToFlags(env upCheckEnv, prefs *ipn.Prefs) (flagVal map[string]any) {
set(prefs.OperatorUser)
case "advertise-routes":
var sb strings.Builder
for i, r := range withoutExitNodes(prefs.AdvertiseRoutes) {
for i, r := range tsaddr.WithoutExitRoutes(views.SliceOf(prefs.AdvertiseRoutes)).All() {
if i > 0 {
sb.WriteByte(',')
}
@@ -1023,7 +1028,7 @@ func prefsToFlags(env upCheckEnv, prefs *ipn.Prefs) (flagVal map[string]any) {
}
set(sb.String())
case "advertise-exit-node":
set(hasExitNodeRoutes(prefs.AdvertiseRoutes))
set(tsaddr.ContainsExitRoutes(views.SliceOf(prefs.AdvertiseRoutes)))
case "advertise-connector":
set(prefs.AppConnector.Advertise)
case "snat-subnet-routes":
@@ -1057,36 +1062,6 @@ func fmtFlagValueArg(flagName string, val any) string {
return fmt.Sprintf("--%s=%v", flagName, shellquote.Join(fmt.Sprint(val)))
}
func hasExitNodeRoutes(rr []netip.Prefix) bool {
var v4, v6 bool
for _, r := range rr {
if r.Bits() == 0 {
if r.Addr().Is4() {
v4 = true
} else if r.Addr().Is6() {
v6 = true
}
}
}
return v4 && v6
}
// withoutExitNodes returns rr unchanged if it has only 1 or 0 /0
// routes. If it has both IPv4 and IPv6 /0 routes, then it returns
// a copy with all /0 routes removed.
func withoutExitNodes(rr []netip.Prefix) []netip.Prefix {
if !hasExitNodeRoutes(rr) {
return rr
}
var out []netip.Prefix
for _, r := range rr {
if r.Bits() > 0 {
out = append(out, r)
}
}
return out
}
// exitNodeIP returns the exit node IP from p, using st to map
// it from its ID form to an IP address if needed.
func exitNodeIP(p *ipn.Prefs, st *ipnstate.Status) (ip netip.Addr) {
@@ -1180,6 +1155,7 @@ func resolveAuthKey(ctx context.Context, v, tags string) (string, error) {
}
tsClient := tailscale.NewClient("-", nil)
tsClient.UserAgent = "tailscale-cli"
tsClient.HTTPClient = credentials.Client(ctx)
tsClient.BaseURL = baseURL

View File

@@ -26,7 +26,7 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
L github.com/google/nftables/expr from github.com/google/nftables+
L github.com/google/nftables/internal/parseexprfunc from github.com/google/nftables+
L github.com/google/nftables/xt from github.com/google/nftables/expr+
github.com/google/uuid from tailscale.com/clientupdate+
DW github.com/google/uuid from tailscale.com/clientupdate+
github.com/gorilla/csrf from tailscale.com/client/web
github.com/gorilla/securecookie from github.com/gorilla/csrf
github.com/hdevalence/ed25519consensus from tailscale.com/clientupdate/distsign+
@@ -80,7 +80,7 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
tailscale.com/client/tailscale/apitype from tailscale.com/client/tailscale+
tailscale.com/client/web from tailscale.com/cmd/tailscale/cli
tailscale.com/clientupdate from tailscale.com/client/web+
tailscale.com/clientupdate/distsign from tailscale.com/clientupdate
LW tailscale.com/clientupdate/distsign from tailscale.com/clientupdate
tailscale.com/cmd/tailscale/cli from tailscale.com/cmd/tailscale
tailscale.com/cmd/tailscale/cli/ffcomplete from tailscale.com/cmd/tailscale/cli
tailscale.com/cmd/tailscale/cli/ffcomplete/internal from tailscale.com/cmd/tailscale/cli/ffcomplete
@@ -92,6 +92,7 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
tailscale.com/disco from tailscale.com/derp
tailscale.com/drive from tailscale.com/client/tailscale+
tailscale.com/envknob from tailscale.com/client/tailscale+
tailscale.com/envknob/featureknob from tailscale.com/client/web
tailscale.com/health from tailscale.com/net/tlsdial+
tailscale.com/health/healthmsg from tailscale.com/cmd/tailscale/cli
tailscale.com/hostinfo from tailscale.com/client/web+
@@ -120,6 +121,7 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
tailscale.com/net/stun from tailscale.com/net/netcheck
L tailscale.com/net/tcpinfo from tailscale.com/derp
tailscale.com/net/tlsdial from tailscale.com/cmd/tailscale/cli+
tailscale.com/net/tlsdial/blockblame from tailscale.com/net/tlsdial
tailscale.com/net/tsaddr from tailscale.com/client/web+
💣 tailscale.com/net/tshttpproxy from tailscale.com/clientupdate/distsign+
tailscale.com/net/wsconn from tailscale.com/control/controlhttp+
@@ -134,7 +136,7 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
tailscale.com/tstime/mono from tailscale.com/tstime/rate
tailscale.com/tstime/rate from tailscale.com/cmd/tailscale/cli+
tailscale.com/tsweb/varz from tailscale.com/util/usermetric
tailscale.com/types/dnstype from tailscale.com/tailcfg
tailscale.com/types/dnstype from tailscale.com/tailcfg+
tailscale.com/types/empty from tailscale.com/ipn
tailscale.com/types/ipproto from tailscale.com/net/flowtrack+
tailscale.com/types/key from tailscale.com/client/tailscale+
@@ -153,7 +155,7 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
tailscale.com/util/clientmetric from tailscale.com/net/netcheck+
tailscale.com/util/cloudenv from tailscale.com/net/dnscache+
tailscale.com/util/cmpver from tailscale.com/net/tshttpproxy+
tailscale.com/util/ctxkey from tailscale.com/types/logger
tailscale.com/util/ctxkey from tailscale.com/types/logger+
💣 tailscale.com/util/deephash from tailscale.com/util/syspolicy/setting
L 💣 tailscale.com/util/dirwalk from tailscale.com/metrics
tailscale.com/util/dnsname from tailscale.com/cmd/tailscale/cli+
@@ -171,13 +173,14 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
tailscale.com/util/singleflight from tailscale.com/net/dnscache+
tailscale.com/util/slicesx from tailscale.com/net/dns/recursive+
tailscale.com/util/syspolicy from tailscale.com/ipn
tailscale.com/util/syspolicy/internal from tailscale.com/util/syspolicy/setting
tailscale.com/util/syspolicy/internal from tailscale.com/util/syspolicy/setting+
tailscale.com/util/syspolicy/internal/loggerx from tailscale.com/util/syspolicy
tailscale.com/util/syspolicy/setting from tailscale.com/util/syspolicy
tailscale.com/util/testenv from tailscale.com/cmd/tailscale/cli
tailscale.com/util/truncate from tailscale.com/cmd/tailscale/cli
tailscale.com/util/usermetric from tailscale.com/health
tailscale.com/util/vizerror from tailscale.com/tailcfg+
💣 tailscale.com/util/winutil from tailscale.com/clientupdate+
W 💣 tailscale.com/util/winutil from tailscale.com/clientupdate+
W 💣 tailscale.com/util/winutil/authenticode from tailscale.com/clientupdate
W 💣 tailscale.com/util/winutil/winenv from tailscale.com/hostinfo+
tailscale.com/version from tailscale.com/client/web+
@@ -257,7 +260,7 @@ tailscale.com/cmd/tailscale dependencies: (generated by github.com/tailscale/dep
crypto/tls from github.com/miekg/dns+
crypto/x509 from crypto/tls+
crypto/x509/pkix from crypto/x509+
database/sql/driver from github.com/google/uuid
DW database/sql/driver from github.com/google/uuid
W debug/dwarf from debug/pe
W debug/pe from github.com/dblohm7/wingoes/pe
embed from crypto/internal/nistec+

View File

@@ -111,7 +111,7 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
L github.com/google/nftables/expr from github.com/google/nftables+
L github.com/google/nftables/internal/parseexprfunc from github.com/google/nftables+
L github.com/google/nftables/xt from github.com/google/nftables/expr+
github.com/google/uuid from tailscale.com/clientupdate+
DW github.com/google/uuid from tailscale.com/clientupdate+
github.com/gorilla/csrf from tailscale.com/client/web
github.com/gorilla/securecookie from github.com/gorilla/csrf
github.com/hdevalence/ed25519consensus from tailscale.com/clientupdate/distsign+
@@ -221,7 +221,7 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
gvisor.dev/gvisor/pkg/tcpip/network/internal/ip from gvisor.dev/gvisor/pkg/tcpip/network/ipv4+
gvisor.dev/gvisor/pkg/tcpip/network/internal/multicast from gvisor.dev/gvisor/pkg/tcpip/network/ipv4+
gvisor.dev/gvisor/pkg/tcpip/network/ipv4 from tailscale.com/net/tstun+
gvisor.dev/gvisor/pkg/tcpip/network/ipv6 from tailscale.com/wgengine/netstack
gvisor.dev/gvisor/pkg/tcpip/network/ipv6 from tailscale.com/wgengine/netstack+
gvisor.dev/gvisor/pkg/tcpip/ports from gvisor.dev/gvisor/pkg/tcpip/stack+
gvisor.dev/gvisor/pkg/tcpip/seqnum from gvisor.dev/gvisor/pkg/tcpip/header+
💣 gvisor.dev/gvisor/pkg/tcpip/stack from gvisor.dev/gvisor/pkg/tcpip/adapters/gonet+
@@ -244,12 +244,13 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
tailscale.com/client/tailscale/apitype from tailscale.com/client/tailscale+
tailscale.com/client/web from tailscale.com/ipn/ipnlocal
tailscale.com/clientupdate from tailscale.com/client/web+
tailscale.com/clientupdate/distsign from tailscale.com/clientupdate
LW tailscale.com/clientupdate/distsign from tailscale.com/clientupdate
tailscale.com/cmd/tailscaled/childproc from tailscale.com/cmd/tailscaled+
tailscale.com/control/controlbase from tailscale.com/control/controlhttp+
tailscale.com/control/controlclient from tailscale.com/cmd/tailscaled+
tailscale.com/control/controlhttp from tailscale.com/control/controlclient
tailscale.com/control/controlknobs from tailscale.com/control/controlclient+
tailscale.com/control/keyfallback from tailscale.com/control/controlclient
tailscale.com/derp from tailscale.com/derp/derphttp+
tailscale.com/derp/derphttp from tailscale.com/cmd/tailscaled+
tailscale.com/disco from tailscale.com/derp+
@@ -263,6 +264,7 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
tailscale.com/drive/driveimpl/dirfs from tailscale.com/drive/driveimpl+
tailscale.com/drive/driveimpl/shared from tailscale.com/drive/driveimpl+
tailscale.com/envknob from tailscale.com/client/tailscale+
tailscale.com/envknob/featureknob from tailscale.com/client/web+
tailscale.com/health from tailscale.com/control/controlclient+
tailscale.com/health/healthmsg from tailscale.com/ipn/ipnlocal
tailscale.com/hostinfo from tailscale.com/client/web+
@@ -321,6 +323,7 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
tailscale.com/net/stun from tailscale.com/ipn/localapi+
L tailscale.com/net/tcpinfo from tailscale.com/derp
tailscale.com/net/tlsdial from tailscale.com/control/controlclient+
tailscale.com/net/tlsdial/blockblame from tailscale.com/net/tlsdial
tailscale.com/net/tsaddr from tailscale.com/client/web+
tailscale.com/net/tsdial from tailscale.com/cmd/tailscaled+
💣 tailscale.com/net/tshttpproxy from tailscale.com/clientupdate/distsign+
@@ -398,7 +401,8 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
tailscale.com/util/singleflight from tailscale.com/control/controlclient+
tailscale.com/util/slicesx from tailscale.com/net/dns/recursive+
tailscale.com/util/syspolicy from tailscale.com/cmd/tailscaled+
tailscale.com/util/syspolicy/internal from tailscale.com/util/syspolicy/setting
tailscale.com/util/syspolicy/internal from tailscale.com/util/syspolicy/setting+
tailscale.com/util/syspolicy/internal/loggerx from tailscale.com/util/syspolicy
tailscale.com/util/syspolicy/setting from tailscale.com/util/syspolicy
tailscale.com/util/sysresources from tailscale.com/wgengine/magicsock
tailscale.com/util/systemd from tailscale.com/control/controlclient+
@@ -507,7 +511,7 @@ tailscale.com/cmd/tailscaled dependencies: (generated by github.com/tailscale/de
crypto/tls from github.com/aws/aws-sdk-go-v2/aws/transport/http+
crypto/x509 from crypto/tls+
crypto/x509/pkix from crypto/x509+
database/sql/driver from github.com/google/uuid
DW database/sql/driver from github.com/google/uuid
W debug/dwarf from debug/pe
W debug/pe from github.com/dblohm7/wingoes/pe
embed from crypto/internal/nistec+

View File

@@ -680,12 +680,15 @@ func tryEngine(logf logger.Logf, sys *tsd.System, name string) (onlyNetstack boo
ListenPort: args.port,
NetMon: sys.NetMon.Get(),
HealthTracker: sys.HealthTracker(),
Metrics: sys.UserMetricsRegistry(),
Dialer: sys.Dialer.Get(),
SetSubsystem: sys.Set,
ControlKnobs: sys.ControlKnobs(),
DriveForLocal: driveimpl.NewFileSystemForLocal(logf),
}
sys.HealthTracker().SetMetricsRegistry(sys.UserMetricsRegistry())
onlyNetstack = name == "userspace-networking"
netstackSubnetRouter := onlyNetstack // but mutated later on some platforms
netns.SetEnabled(!onlyNetstack)

View File

@@ -108,6 +108,7 @@ func newIPN(jsConfig js.Value) map[string]any {
SetSubsystem: sys.Set,
ControlKnobs: sys.ControlKnobs(),
HealthTracker: sys.HealthTracker(),
Metrics: sys.UserMetricsRegistry(),
})
if err != nil {
log.Fatal(err)
@@ -128,6 +129,9 @@ func newIPN(jsConfig js.Value) map[string]any {
dialer.NetstackDialTCP = func(ctx context.Context, dst netip.AddrPort) (net.Conn, error) {
return ns.DialContextTCP(ctx, dst)
}
dialer.NetstackDialUDP = func(ctx context.Context, dst netip.AddrPort) (net.Conn, error) {
return ns.DialContextUDP(ctx, dst)
}
sys.NetstackRouter.Set(true)
sys.Tun.Get().Start()

View File

@@ -64,6 +64,7 @@ var (
flagLocalPort = flag.Int("local-port", -1, "allow requests from localhost")
flagUseLocalTailscaled = flag.Bool("use-local-tailscaled", false, "use local tailscaled instead of tsnet")
flagFunnel = flag.Bool("funnel", false, "use Tailscale Funnel to make tsidp available on the public internet")
flagDir = flag.String("dir", "", "tsnet state directory; a default one will be created if not provided")
)
func main() {
@@ -120,6 +121,7 @@ func main() {
} else {
ts := &tsnet.Server{
Hostname: "idp",
Dir: *flagDir,
}
if *flagVerbose {
ts.Logf = log.Printf

View File

@@ -258,6 +258,7 @@ func genView(buf *bytes.Buffer, it *codegen.ImportTracker, typ *types.Named, thi
writeTemplate("unsupportedField")
continue
}
it.Import("tailscale.com/types/views")
args.MapKeyType = it.QualifiedName(key)
mElem := m.Elem()
var template string

78
cmd/viewer/viewer_test.go Normal file
View File

@@ -0,0 +1,78 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
package main
import (
"bytes"
"fmt"
"go/ast"
"go/parser"
"go/token"
"go/types"
"testing"
"tailscale.com/util/codegen"
)
func TestViewerImports(t *testing.T) {
tests := []struct {
name string
content string
typeNames []string
wantImports []string
}{
{
name: "Map",
content: `type Test struct { Map map[string]int }`,
typeNames: []string{"Test"},
wantImports: []string{"tailscale.com/types/views"},
},
{
name: "Slice",
content: `type Test struct { Slice []int }`,
typeNames: []string{"Test"},
wantImports: []string{"tailscale.com/types/views"},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
fset := token.NewFileSet()
f, err := parser.ParseFile(fset, "test.go", "package test\n\n"+tt.content, 0)
if err != nil {
fmt.Println("Error parsing:", err)
return
}
info := &types.Info{
Types: make(map[ast.Expr]types.TypeAndValue),
}
conf := types.Config{}
pkg, err := conf.Check("", fset, []*ast.File{f}, info)
if err != nil {
t.Fatal(err)
}
var output bytes.Buffer
tracker := codegen.NewImportTracker(pkg)
for i := range tt.typeNames {
typeName, ok := pkg.Scope().Lookup(tt.typeNames[i]).(*types.TypeName)
if !ok {
t.Fatalf("type %q does not exist", tt.typeNames[i])
}
namedType, ok := typeName.Type().(*types.Named)
if !ok {
t.Fatalf("%q is not a named type", tt.typeNames[i])
}
genView(&output, tracker, namedType, pkg)
}
for _, pkgName := range tt.wantImports {
if !tracker.Has(pkgName) {
t.Errorf("missing import %q", pkgName)
}
}
})
}
}

View File

@@ -29,9 +29,11 @@ import (
"go4.org/mem"
"tailscale.com/control/controlknobs"
"tailscale.com/control/keyfallback"
"tailscale.com/envknob"
"tailscale.com/health"
"tailscale.com/hostinfo"
"tailscale.com/ipn"
"tailscale.com/ipn/ipnstate"
"tailscale.com/logtail"
"tailscale.com/net/dnscache"
@@ -82,12 +84,15 @@ type Direct struct {
onControlTime func(time.Time) // or nil
onTailnetDefaultAutoUpdate func(bool) // or nil
panicOnUse bool // if true, panic if client is used (for testing)
closedCtx context.Context // alive until Direct.Close is called
closeCtx context.CancelFunc // cancels closedCtx
dialPlan ControlDialPlanner // can be nil
mu sync.Mutex // mutex guards the following fields
serverLegacyKey key.MachinePublic // original ("legacy") nacl crypto_box-based public key; only used for signRegisterRequest on Windows now
serverNoiseKey key.MachinePublic
mu sync.Mutex // mutex guards the following fields
serverLegacyKey key.MachinePublic // original ("legacy") nacl crypto_box-based public key; only used for signRegisterRequest on Windows now
serverNoiseKey key.MachinePublic
usedFallbackNoiseKey bool // true if we used the baked-in fallback key
sfGroup singleflight.Group[struct{}, *NoiseClient] // protects noiseClient creation.
noiseClient *NoiseClient
@@ -303,6 +308,8 @@ func NewDirect(opts Options) (*Direct, error) {
dnsCache: dnsCache,
dialPlan: opts.DialPlan,
}
c.closedCtx, c.closeCtx = context.WithCancel(context.Background())
if opts.Hostinfo == nil {
c.SetHostinfo(hostinfo.New())
} else {
@@ -325,6 +332,8 @@ func NewDirect(opts Options) (*Direct, error) {
// Close closes the underlying Noise connection(s).
func (c *Direct) Close() error {
c.closeCtx()
c.mu.Lock()
defer c.mu.Unlock()
if c.noiseClient != nil {
@@ -492,6 +501,7 @@ func (c *Direct) doLogin(ctx context.Context, opt loginOpt) (mustRegen bool, new
tryingNewKey := c.tryingNewKey
serverKey := c.serverLegacyKey
serverNoiseKey := c.serverNoiseKey
usedFallback := c.usedFallbackNoiseKey
authKey, isWrapped, wrappedSig, wrappedKey := tka.DecodeWrappedAuthkey(c.authKey, c.logf)
hi := c.hostInfoLocked()
backendLogID := hi.BackendLogID
@@ -522,7 +532,7 @@ func (c *Direct) doLogin(ctx context.Context, opt loginOpt) (mustRegen bool, new
}
c.logf("doLogin(regen=%v, hasUrl=%v)", regen, opt.URL != "")
if serverKey.IsZero() {
if serverKey.IsZero() || usedFallback {
keys, err := loadServerPubKeys(ctx, c.httpc, c.serverURL)
if err != nil && c.interceptedDial != nil && c.interceptedDial.Load() {
c.health.SetUnhealthy(macOSScreenTime, nil)
@@ -530,13 +540,21 @@ func (c *Direct) doLogin(ctx context.Context, opt loginOpt) (mustRegen bool, new
c.health.SetHealthy(macOSScreenTime)
}
if err != nil {
return regen, opt.URL, nil, err
if k2, err := c.getFallbackServerPubKeys(); err == nil {
keys = k2
usedFallback = true
} else {
return regen, opt.URL, nil, err
}
} else {
usedFallback = false
c.logf("control server key from %s: ts2021=%s", c.serverURL, keys.PublicKey.ShortString())
}
c.logf("control server key from %s: ts2021=%s, legacy=%v", c.serverURL, keys.PublicKey.ShortString(), keys.LegacyPublicKey.ShortString())
c.mu.Lock()
c.serverLegacyKey = keys.LegacyPublicKey
c.serverNoiseKey = keys.PublicKey
c.usedFallbackNoiseKey = usedFallback
c.mu.Unlock()
serverKey = keys.LegacyPublicKey
serverNoiseKey = keys.PublicKey
@@ -745,6 +763,22 @@ func (c *Direct) doLogin(ctx context.Context, opt loginOpt) (mustRegen bool, new
return false, resp.AuthURL, nil, nil
}
func (c *Direct) getFallbackServerPubKeys() (*tailcfg.OverTLSPublicKeyResponse, error) {
// If we saw an error, try to use the fallback key if
// we're dialing the default control server.
if ipn.IsLoginServerSynonym(c.serverURL) {
return nil, errors.New("not using default control server")
}
kf, err := keyfallback.Get()
if err != nil {
return nil, err
}
c.logf("using fallback server key: ts2021=%s", kf.PublicKey.ShortString())
return kf, nil
}
// newEndpoints acquires c.mu and sets the local port and endpoints and reports
// whether they've changed.
//
@@ -1223,7 +1257,7 @@ func loadServerPubKeys(ctx context.Context, httpc *http.Client, serverURL string
return nil, fmt.Errorf("fetch control key response: %v", err)
}
if res.StatusCode != 200 {
return nil, fmt.Errorf("fetch control key: %d", res.StatusCode)
return nil, fmt.Errorf("fetch control key: %v", res.Status)
}
var out tailcfg.OverTLSPublicKeyResponse
jsonErr := json.Unmarshal(b, &out)
@@ -1628,7 +1662,7 @@ func (c *Direct) ReportHealthChange(w *health.Warnable, us *health.UnhealthyStat
}
// Best effort, no logging:
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
ctx, cancel := context.WithTimeout(c.closedCtx, 5*time.Second)
defer cancel()
res, err := np.post(ctx, "/machine/update-health", nodeKey, req)
if err != nil {

View File

@@ -5,6 +5,7 @@ package controlclient
import (
"bytes"
"cmp"
"context"
"encoding/json"
"errors"
@@ -16,6 +17,7 @@ import (
"golang.org/x/net/http2"
"tailscale.com/control/controlhttp"
"tailscale.com/envknob"
"tailscale.com/health"
"tailscale.com/internal/noiseconn"
"tailscale.com/net/dnscache"
@@ -28,6 +30,7 @@ import (
"tailscale.com/util/mak"
"tailscale.com/util/multierr"
"tailscale.com/util/singleflight"
"tailscale.com/util/testenv"
)
// NoiseClient provides a http.Client to connect to tailcontrol over
@@ -56,8 +59,8 @@ type NoiseClient struct {
privKey key.MachinePrivate
serverPubKey key.MachinePublic
host string // the host part of serverURL
httpPort string // the default port to call
httpsPort string // the fallback Noise-over-https port
httpPort string // the default port to dial
httpsPort string // the fallback Noise-over-https port or empty if none
// dialPlan optionally returns a ControlDialPlan previously received
// from the control server; either the function or the return value can
@@ -104,6 +107,11 @@ type NoiseOpts struct {
DialPlan func() *tailcfg.ControlDialPlan
}
// controlIsPlaintext is whether we should assume that the controlplane is only accessible
// over plaintext HTTP (as the first hop, before the ts2021 encryption begins).
// This is used by some tests which don't have a real TLS certificate.
var controlIsPlaintext = envknob.RegisterBool("TS_CONTROL_IS_PLAINTEXT_HTTP")
// NewNoiseClient returns a new noiseClient for the provided server and machine key.
// serverURL is of the form https://<host>:<port> (no trailing slash).
//
@@ -116,14 +124,17 @@ func NewNoiseClient(opts NoiseOpts) (*NoiseClient, error) {
}
var httpPort string
var httpsPort string
if u.Port() != "" {
if port := u.Port(); port != "" {
// If there is an explicit port specified, trust the scheme and hope for the best
if u.Scheme == "http" {
httpPort = u.Port()
httpPort = port
httpsPort = "443"
if (testenv.InTest() || controlIsPlaintext()) && (u.Hostname() == "127.0.0.1" || u.Hostname() == "localhost") {
httpsPort = ""
}
} else {
httpPort = "80"
httpsPort = u.Port()
httpsPort = port
}
} else {
// Otherwise, use the standard ports
@@ -340,7 +351,7 @@ func (nc *NoiseClient) dial(ctx context.Context) (*noiseconn.Conn, error) {
clientConn, err := (&controlhttp.Dialer{
Hostname: nc.host,
HTTPPort: nc.httpPort,
HTTPSPort: nc.httpsPort,
HTTPSPort: cmp.Or(nc.httpsPort, controlhttp.NoPort),
MachineKey: nc.privKey,
ControlKey: nc.serverPubKey,
ProtocolVersion: uint16(tailcfg.CurrentCapabilityVersion),

View File

@@ -86,9 +86,6 @@ func (a *Dialer) getProxyFunc() func(*http.Request) (*url.URL, error) {
// httpsFallbackDelay is how long we'll wait for a.HTTPPort to work before
// starting to try a.HTTPSPort.
func (a *Dialer) httpsFallbackDelay() time.Duration {
if forceNoise443() {
return time.Nanosecond
}
if v := a.testFallbackDelay; v != 0 {
return v
}
@@ -151,10 +148,7 @@ func (a *Dialer) dial(ctx context.Context) (*ClientConn, error) {
// before we do anything.
if c.DialStartDelaySec > 0 {
a.logf("[v2] controlhttp: waiting %.2f seconds before dialing %q @ %v", c.DialStartDelaySec, a.Hostname, c.IP)
if a.Clock == nil {
a.Clock = tstime.StdClock{}
}
tmr, tmrChannel := a.Clock.NewTimer(time.Duration(c.DialStartDelaySec * float64(time.Second)))
tmr, tmrChannel := a.clock().NewTimer(time.Duration(c.DialStartDelaySec * float64(time.Second)))
defer tmr.Stop()
select {
case <-ctx.Done():
@@ -268,12 +262,43 @@ func (a *Dialer) dial(ctx context.Context) (*ClientConn, error) {
// fixed, this is a workaround. It might also be useful for future debugging.
var forceNoise443 = envknob.RegisterBool("TS_FORCE_NOISE_443")
// forceNoise443 reports whether the controlclient noise dialer should always
// use HTTPS connections as its underlay connection (double crypto). This can
// be necessary when networks or middle boxes are messing with port 80.
func (d *Dialer) forceNoise443() bool {
if forceNoise443() {
return true
}
if d.HealthTracker.LastNoiseDialWasRecent() {
// If we dialed recently, assume there was a recent failure and fall
// back to HTTPS dials for the subsequent retries.
//
// This heuristic works around networks where port 80 is MITMed and
// appears to work for a bit post-Upgrade but then gets closed,
// such as seen in https://github.com/tailscale/tailscale/issues/13597.
d.logf("controlhttp: forcing port 443 dial due to recent noise dial")
return true
}
return false
}
func (d *Dialer) clock() tstime.Clock {
if d.Clock != nil {
return d.Clock
}
return tstime.StdClock{}
}
var debugNoiseDial = envknob.RegisterBool("TS_DEBUG_NOISE_DIAL")
// dialHost connects to the configured Dialer.Hostname and upgrades the
// connection into a controlbase.Conn. If addr is valid, then no DNS is used
// and the connection will be made to the provided address.
func (a *Dialer) dialHost(ctx context.Context, addr netip.Addr) (*ClientConn, error) {
// connection into a controlbase.Conn.
//
// If optAddr is valid, then no DNS is used and the connection will be made to the
// provided address.
func (a *Dialer) dialHost(ctx context.Context, optAddr netip.Addr) (*ClientConn, error) {
// Create one shared context used by both port 80 and port 443 dials.
// If port 80 is still in flight when 443 returns, this deferred cancel
// will stop the port 80 dial.
@@ -295,6 +320,9 @@ func (a *Dialer) dialHost(ctx context.Context, addr netip.Addr) (*ClientConn, er
Host: net.JoinHostPort(a.Hostname, strDef(a.HTTPSPort, "443")),
Path: serverUpgradePath,
}
if a.HTTPSPort == NoPort {
u443 = nil
}
type tryURLRes struct {
u *url.URL // input (the URL conn+err are for/from)
@@ -304,11 +332,11 @@ func (a *Dialer) dialHost(ctx context.Context, addr netip.Addr) (*ClientConn, er
ch := make(chan tryURLRes) // must be unbuffered
try := func(u *url.URL) {
if debugNoiseDial() {
a.logf("trying noise dial (%v, %v) ...", u, addr)
a.logf("trying noise dial (%v, %v) ...", u, optAddr)
}
cbConn, err := a.dialURL(ctx, u, addr)
cbConn, err := a.dialURL(ctx, u, optAddr)
if debugNoiseDial() {
a.logf("noise dial (%v, %v) = (%v, %v)", u, addr, cbConn, err)
a.logf("noise dial (%v, %v) = (%v, %v)", u, optAddr, cbConn, err)
}
select {
case ch <- tryURLRes{u, cbConn, err}:
@@ -319,18 +347,24 @@ func (a *Dialer) dialHost(ctx context.Context, addr netip.Addr) (*ClientConn, er
}
}
forceTLS := a.forceNoise443()
// Start the plaintext HTTP attempt first, unless disabled by the envknob.
if !forceNoise443() {
if !forceTLS || u443 == nil {
go try(u80)
}
// In case outbound port 80 blocked or MITM'ed poorly, start a backup timer
// to dial port 443 if port 80 doesn't either succeed or fail quickly.
if a.Clock == nil {
a.Clock = tstime.StdClock{}
var try443Timer tstime.TimerController
if u443 != nil {
delay := a.httpsFallbackDelay()
if forceTLS {
delay = 0
}
try443Timer = a.clock().AfterFunc(delay, func() { try(u443) })
defer try443Timer.Stop()
}
try443Timer := a.Clock.AfterFunc(a.httpsFallbackDelay(), func() { try(u443) })
defer try443Timer.Stop()
var err80, err443 error
for {
@@ -349,7 +383,7 @@ func (a *Dialer) dialHost(ctx context.Context, addr netip.Addr) (*ClientConn, er
// Stop the fallback timer and run it immediately. We don't use
// Timer.Reset(0) here because on AfterFuncs, that can run it
// again.
if try443Timer.Stop() {
if try443Timer != nil && try443Timer.Stop() {
go try(u443)
} // else we lost the race and it started already which is what we want
case u443:
@@ -365,12 +399,15 @@ func (a *Dialer) dialHost(ctx context.Context, addr netip.Addr) (*ClientConn, er
}
// dialURL attempts to connect to the given URL.
func (a *Dialer) dialURL(ctx context.Context, u *url.URL, addr netip.Addr) (*ClientConn, error) {
//
// If optAddr is valid, then no DNS is used and the connection will be made to the
// provided address.
func (a *Dialer) dialURL(ctx context.Context, u *url.URL, optAddr netip.Addr) (*ClientConn, error) {
init, cont, err := controlbase.ClientDeferred(a.MachineKey, a.ControlKey, a.ProtocolVersion)
if err != nil {
return nil, err
}
netConn, err := a.tryURLUpgrade(ctx, u, addr, init)
netConn, err := a.tryURLUpgrade(ctx, u, optAddr, init)
if err != nil {
return nil, err
}
@@ -416,19 +453,20 @@ var macOSScreenTime = health.Register(&health.Warnable{
ImpactsConnectivity: true,
})
// tryURLUpgrade connects to u, and tries to upgrade it to a net.Conn. If addr
// is valid, then no DNS is used and the connection will be made to the
// provided address.
// tryURLUpgrade connects to u, and tries to upgrade it to a net.Conn.
//
// If optAddr is valid, then no DNS is used and the connection will be made to
// the provided address.
//
// Only the provided ctx is used, not a.ctx.
func (a *Dialer) tryURLUpgrade(ctx context.Context, u *url.URL, addr netip.Addr, init []byte) (_ net.Conn, retErr error) {
func (a *Dialer) tryURLUpgrade(ctx context.Context, u *url.URL, optAddr netip.Addr, init []byte) (_ net.Conn, retErr error) {
var dns *dnscache.Resolver
// If we were provided an address to dial, then create a resolver that just
// returns that value; otherwise, fall back to DNS.
if addr.IsValid() {
if optAddr.IsValid() {
dns = &dnscache.Resolver{
SingleHostStaticResult: []netip.Addr{addr},
SingleHostStaticResult: []netip.Addr{optAddr},
SingleHost: u.Hostname(),
Logf: a.Logf, // not a.logf method; we want to propagate nil-ness
}

View File

@@ -32,6 +32,11 @@ const (
serverUpgradePath = "/ts2021"
)
// NoPort is a sentinel value for Dialer.HTTPSPort to indicate that HTTPS
// should not be tried on any port. It exists primarily for some localhost
// tests where the control plane only runs on HTTP.
const NoPort = "none"
// Dialer contains configuration on how to dial the Tailscale control server.
type Dialer struct {
// Hostname is the hostname to connect to, with no port number.
@@ -62,6 +67,8 @@ type Dialer struct {
// HTTPSPort is the port number to use when making a HTTPS connection.
//
// If not specified, this defaults to port 443.
//
// If "none" (NoPort), HTTPS is disabled.
HTTPSPort string
// Dialer is the dialer used to make outbound connections.
@@ -95,8 +102,9 @@ type Dialer struct {
omitCertErrorLogging bool
testFallbackDelay time.Duration
// tstime.Clock is used instead of time package for methods such as time.Now.
// If not specified, will default to tstime.StdClock{}.
// Clock, if non-nil, overrides the clock to use.
// If nil, tstime.StdClock is used.
// This exists primarily for tests.
Clock tstime.Clock
}

View File

@@ -1,6 +1,8 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build !ios
package controlhttp
import (

View File

@@ -0,0 +1,4 @@
{
"legacyPublicKey": "mkey:9e5156a4c65121306dd2d8ed8f92cb8d738e2533011344b522c5d28409bc4970",
"publicKey": "mkey:7d2792f9c98d753d2042471536801949104c247f95eac770f8fb321595e2173b"
}

View File

@@ -0,0 +1,32 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
// Package keyfallback contains a fallback mechanism for starting up Tailscale
// when the control server cannot be reached to obtain the primary Noise key.
//
// The data is backed by a JSON file `control-key.json` that is updated by
// `update.go`:
//
// (cd control/keyfallback; go run update.go)
package keyfallback
import (
_ "embed"
"encoding/json"
"tailscale.com/tailcfg"
)
// Get returns the fallback control server public key that was baked into the
// binary at compile time. It is only valid for the main Tailscale control
// server instance.
func Get() (*tailcfg.OverTLSPublicKeyResponse, error) {
out := &tailcfg.OverTLSPublicKeyResponse{}
if err := json.Unmarshal(controlKeyJSON, out); err != nil {
return nil, err
}
return out, nil
}
//go:embed control-key.json
var controlKeyJSON []byte

View File

@@ -0,0 +1,77 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
package keyfallback
import (
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"reflect"
"testing"
"time"
"tailscale.com/ipn"
"tailscale.com/tailcfg"
"tailscale.com/tstest/nettest"
"tailscale.com/util/must"
)
func TestHasValidControlKey(t *testing.T) {
t.Parallel()
keys, err := Get()
if err != nil {
t.Fatalf("Get: %v", err)
}
if keys.PublicKey.IsZero() {
t.Fatalf("zero key")
}
}
// TestKeyIsUpToDate fetches the control key from the control server and
// compares it to the baked-in key, to verify that it's up-to-date. If the
// control server is unreachable, the test is skipped.
func TestKeyIsUpToDate(t *testing.T) {
nettest.SkipIfNoNetwork(t)
// Optimistically fetch the control key and check if it's up to date,
// but ignore if we don't have network access (e.g. running tests on an
// airplane).
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
keyURL := fmt.Sprintf("%v/key?v=%d", ipn.DefaultControlURL, tailcfg.CurrentCapabilityVersion)
req := must.Get(http.NewRequestWithContext(ctx, "GET", keyURL, nil))
res, err := http.DefaultClient.Do(req)
if err != nil {
t.Logf("fetch control key: %v", err)
return
}
defer res.Body.Close()
if res.StatusCode != 200 {
t.Fatalf("fetch control key: bad status; got %v, want 200", res.Status)
}
b, err := io.ReadAll(res.Body)
if err != nil {
t.Fatalf("read control key: %v", err)
}
// Verify that the key is up to date and matches the baked-in key.
out := &tailcfg.OverTLSPublicKeyResponse{}
if err := json.Unmarshal(b, out); err != nil {
t.Fatalf("unmarshal control key: %v", err)
}
keys, err := Get()
if err != nil {
t.Fatalf("Get: %v", err)
}
if !reflect.DeepEqual(keys, out) {
t.Errorf("control key is out of date")
t.Logf("old key: %v", keys)
t.Logf("new key: %v", out)
}
}

View File

@@ -0,0 +1,47 @@
// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
//go:build ignore
package main
import (
"encoding/json"
"fmt"
"io"
"log"
"net/http"
"os"
"tailscale.com/ipn"
"tailscale.com/tailcfg"
)
func main() {
keyURL := fmt.Sprintf("%v/key?v=%d", ipn.DefaultControlURL, tailcfg.CurrentCapabilityVersion)
res, err := http.Get(keyURL)
if err != nil {
log.Fatalf("fetch control key: %v", err)
}
defer res.Body.Close()
b, err := io.ReadAll(io.LimitReader(res.Body, 64<<10))
if err != nil {
log.Fatalf("read control key: %v", err)
}
if res.StatusCode != 200 {
log.Fatalf("fetch control key: bad status; got %v, want 200", res.Status)
}
// Unmarshal to make sure it's valid.
var out tailcfg.OverTLSPublicKeyResponse
if err := json.Unmarshal(b, &out); err != nil {
log.Fatalf("unmarshal control key: %v", err)
}
if out.PublicKey.IsZero() {
log.Fatalf("control key is zero")
}
if err := os.WriteFile("control-key.json", b, 0644); err != nil {
log.Fatalf("write control key: %v", err)
}
}

View File

@@ -147,6 +147,7 @@ const (
PeerPresentIsRegular = 1 << 0
PeerPresentIsMeshPeer = 1 << 1
PeerPresentIsProber = 1 << 2
PeerPresentNotIdeal = 1 << 3 // client said derp server is not its Region.Nodes[0] ideal node
)
var bin = binary.BigEndian

View File

@@ -356,6 +356,10 @@ func (ReceivedPacket) msg() {}
// PeerGoneMessage is a ReceivedMessage that indicates that the client
// identified by the underlying public key is not connected to this
// server.
//
// It has only historically been sent by the server when the client
// connection count decremented from 1 to 0 and not from e.g. 2 to 1.
// See https://github.com/tailscale/tailscale/issues/13566 for details.
type PeerGoneMessage struct {
Peer key.NodePublic
Reason PeerGoneReasonType
@@ -363,8 +367,13 @@ type PeerGoneMessage struct {
func (PeerGoneMessage) msg() {}
// PeerPresentMessage is a ReceivedMessage that indicates that the client
// is connected to the server. (Only used by trusted mesh clients)
// PeerPresentMessage is a ReceivedMessage that indicates that the client is
// connected to the server. (Only used by trusted mesh clients)
//
// It will be sent to client watchers for every new connection from a client,
// even if the client's already connected with that public key.
// See https://github.com/tailscale/tailscale/issues/13566 for PeerPresentMessage
// and PeerGoneMessage not being 1:1.
type PeerPresentMessage struct {
// Key is the public key of the client.
Key key.NodePublic

View File

@@ -26,6 +26,7 @@ import (
"net"
"net/http"
"net/netip"
"os"
"os/exec"
"runtime"
"strconv"
@@ -46,6 +47,7 @@ import (
"tailscale.com/tstime/rate"
"tailscale.com/types/key"
"tailscale.com/types/logger"
"tailscale.com/util/ctxkey"
"tailscale.com/util/mak"
"tailscale.com/util/set"
"tailscale.com/util/slicesx"
@@ -56,6 +58,16 @@ import (
// verbosely log whenever DERP drops a packet.
var verboseDropKeys = map[key.NodePublic]bool{}
// IdealNodeHeader is the HTTP request header sent on DERP HTTP client requests
// to indicate that they're connecting to their ideal (Region.Nodes[0]) node.
// The HTTP header value is the name of the node they wish they were connected
// to. This is an optional header.
const IdealNodeHeader = "Ideal-Node"
// IdealNodeContextKey is the context key used to pass the IdealNodeHeader value
// from the HTTP handler to the DERP server's Accept method.
var IdealNodeContextKey = ctxkey.New[string]("ideal-node", "")
func init() {
keys := envknob.String("TS_DEBUG_VERBOSE_DROPS")
if keys == "" {
@@ -74,6 +86,7 @@ func init() {
const (
perClientSendQueueDepth = 32 // packets buffered for sending
writeTimeout = 2 * time.Second
privilegedWriteTimeout = 30 * time.Second // for clients with the mesh key
)
// dupPolicy is a temporary (2021-08-30) mechanism to change the policy
@@ -131,6 +144,7 @@ type Server struct {
sentPong expvar.Int // number of pong frames enqueued to client
accepts expvar.Int
curClients expvar.Int
curClientsNotIdeal expvar.Int
curHomeClients expvar.Int // ones with preferred
dupClientKeys expvar.Int // current number of public keys we have 2+ connections for
dupClientConns expvar.Int // current number of connections sharing a public key
@@ -141,10 +155,12 @@ type Server struct {
multiForwarderCreated expvar.Int
multiForwarderDeleted expvar.Int
removePktForwardOther expvar.Int
sclientWriteTimeouts expvar.Int
avgQueueDuration *uint64 // In milliseconds; accessed atomically
tcpRtt metrics.LabelMap // histogram
meshUpdateBatchSize *metrics.Histogram
meshUpdateLoopCount *metrics.Histogram
bufferedWriteFrames *metrics.Histogram // how many sendLoop frames (or groups of related frames) get written per flush
// verifyClientsLocalTailscaled only accepts client connections to the DERP
// server if the clientKey is a known peer in the network, as specified by a
@@ -349,6 +365,7 @@ func NewServer(privateKey key.NodePrivate, logf logger.Logf) *Server {
tcpRtt: metrics.LabelMap{Label: "le"},
meshUpdateBatchSize: metrics.NewHistogram([]float64{0, 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000}),
meshUpdateLoopCount: metrics.NewHistogram([]float64{0, 1, 2, 5, 10, 20, 50, 100}),
bufferedWriteFrames: metrics.NewHistogram([]float64{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100}),
keyOfAddr: map[netip.AddrPort]key.NodePublic{},
clock: tstime.StdClock{},
}
@@ -598,6 +615,9 @@ func (s *Server) registerClient(c *sclient) {
}
s.keyOfAddr[c.remoteIPPort] = c.key
s.curClients.Add(1)
if c.isNotIdealConn {
s.curClientsNotIdeal.Add(1)
}
s.broadcastPeerStateChangeLocked(c.key, c.remoteIPPort, c.presentFlags(), true)
}
@@ -688,6 +708,9 @@ func (s *Server) unregisterClient(c *sclient) {
if c.preferred {
s.curHomeClients.Add(-1)
}
if c.isNotIdealConn {
s.curClientsNotIdeal.Add(-1)
}
}
// addPeerGoneFromRegionWatcher adds a function to be called when peer is gone
@@ -804,8 +827,8 @@ func (s *Server) accept(ctx context.Context, nc Conn, brw *bufio.ReadWriter, rem
return fmt.Errorf("receive client key: %v", err)
}
clientAP, _ := netip.ParseAddrPort(remoteAddr)
if err := s.verifyClient(ctx, clientKey, clientInfo, clientAP.Addr()); err != nil {
remoteIPPort, _ := netip.ParseAddrPort(remoteAddr)
if err := s.verifyClient(ctx, clientKey, clientInfo, remoteIPPort.Addr()); err != nil {
return fmt.Errorf("client %v rejected: %v", clientKey, err)
}
@@ -815,8 +838,6 @@ func (s *Server) accept(ctx context.Context, nc Conn, brw *bufio.ReadWriter, rem
ctx, cancel := context.WithCancel(ctx)
defer cancel()
remoteIPPort, _ := netip.ParseAddrPort(remoteAddr)
c := &sclient{
connNum: connNum,
s: s,
@@ -833,6 +854,7 @@ func (s *Server) accept(ctx context.Context, nc Conn, brw *bufio.ReadWriter, rem
sendPongCh: make(chan [8]byte, 1),
peerGone: make(chan peerGoneMsg),
canMesh: s.isMeshPeer(clientInfo),
isNotIdealConn: IdealNodeContextKey.Value(ctx) != "",
peerGoneLim: rate.NewLimiter(rate.Every(time.Second), 3),
}
@@ -879,6 +901,9 @@ func (c *sclient) run(ctx context.Context) error {
if errors.Is(err, context.Canceled) {
c.debugLogf("sender canceled by reader exiting")
} else {
if errors.Is(err, os.ErrDeadlineExceeded) {
c.s.sclientWriteTimeouts.Add(1)
}
c.logf("sender failed: %v", err)
}
}
@@ -1503,6 +1528,7 @@ type sclient struct {
peerGone chan peerGoneMsg // write request that a peer is not at this server (not used by mesh peers)
meshUpdate chan struct{} // write request to write peerStateChange
canMesh bool // clientInfo had correct mesh token for inter-region routing
isNotIdealConn bool // client indicated it is not its ideal node in the region
isDup atomic.Bool // whether more than 1 sclient for key is connected
isDisabled atomic.Bool // whether sends to this peer are disabled due to active/active dups
debug bool // turn on for verbose logging
@@ -1538,6 +1564,9 @@ func (c *sclient) presentFlags() PeerPresentFlags {
if c.canMesh {
f |= PeerPresentIsMeshPeer
}
if c.isNotIdealConn {
f |= PeerPresentNotIdeal
}
if f == 0 {
return PeerPresentIsRegular
}
@@ -1653,10 +1682,12 @@ func (c *sclient) sendLoop(ctx context.Context) error {
defer keepAliveTick.Stop()
var werr error // last write error
inBatch := -1 // for bufferedWriteFrames
for {
if werr != nil {
return werr
}
inBatch++
// First, a non-blocking select (with a default) that
// does as many non-flushing writes as possible.
select {
@@ -1688,6 +1719,10 @@ func (c *sclient) sendLoop(ctx context.Context) error {
if werr = c.bw.Flush(); werr != nil {
return werr
}
if inBatch != 0 { // the first loop will almost always hit default & be size zero
c.s.bufferedWriteFrames.Observe(float64(inBatch))
inBatch = 0
}
}
// Then a blocking select with same:
@@ -1698,7 +1733,6 @@ func (c *sclient) sendLoop(ctx context.Context) error {
werr = c.sendPeerGone(msg.peer, msg.reason)
case <-c.meshUpdate:
werr = c.sendMeshUpdates()
continue
case msg := <-c.sendQueue:
werr = c.sendPacket(msg.src, msg.bs)
c.recordQueueTime(msg.enqueuedAt)
@@ -1707,7 +1741,6 @@ func (c *sclient) sendLoop(ctx context.Context) error {
c.recordQueueTime(msg.enqueuedAt)
case msg := <-c.sendPongCh:
werr = c.sendPong(msg)
continue
case <-keepAliveTickChannel:
werr = c.sendKeepAlive()
}
@@ -1715,7 +1748,19 @@ func (c *sclient) sendLoop(ctx context.Context) error {
}
func (c *sclient) setWriteDeadline() {
c.nc.SetWriteDeadline(time.Now().Add(writeTimeout))
d := writeTimeout
if c.canMesh {
// Trusted peers get more tolerance.
//
// The "canMesh" is a bit of a misnomer; mesh peers typically run over a
// different interface for a per-region private VPC and are not
// throttled. But monitoring software elsewhere over the internet also
// use the private mesh key to subscribe to connect/disconnect events
// and might hit throttling and need more time to get the initial dump
// of connected peers.
d = privilegedWriteTimeout
}
c.nc.SetWriteDeadline(time.Now().Add(d))
}
// sendKeepAlive sends a keep-alive frame, without flushing.
@@ -2027,6 +2072,7 @@ func (s *Server) ExpVar() expvar.Var {
m.Set("gauge_current_file_descriptors", expvar.Func(func() any { return metrics.CurrentFDs() }))
m.Set("gauge_current_connections", &s.curClients)
m.Set("gauge_current_home_connections", &s.curHomeClients)
m.Set("gauge_current_notideal_connections", &s.curClientsNotIdeal)
m.Set("gauge_clients_total", expvar.Func(func() any { return len(s.clientsMesh) }))
m.Set("gauge_clients_local", expvar.Func(func() any { return len(s.clients) }))
m.Set("gauge_clients_remote", expvar.Func(func() any { return len(s.clientsMesh) - len(s.clients) }))
@@ -2054,12 +2100,14 @@ func (s *Server) ExpVar() expvar.Var {
m.Set("multiforwarder_created", &s.multiForwarderCreated)
m.Set("multiforwarder_deleted", &s.multiForwarderDeleted)
m.Set("packet_forwarder_delete_other_value", &s.removePktForwardOther)
m.Set("sclient_write_timeouts", &s.sclientWriteTimeouts)
m.Set("average_queue_duration_ms", expvar.Func(func() any {
return math.Float64frombits(atomic.LoadUint64(s.avgQueueDuration))
}))
m.Set("counter_tcp_rtt", &s.tcpRtt)
m.Set("counter_mesh_update_batch_size", s.meshUpdateBatchSize)
m.Set("counter_mesh_update_loop_count", s.meshUpdateLoopCount)
m.Set("counter_buffered_write_frames", s.bufferedWriteFrames)
var expvarVersion expvar.String
expvarVersion.Set(version.Long())
m.Set("version", &expvarVersion)

View File

@@ -498,7 +498,7 @@ func (c *Client) connect(ctx context.Context, caller string) (client *derp.Clien
req.Header.Set("Connection", "Upgrade")
if !idealNodeInRegion && reg != nil {
// This is purely informative for now (2024-07-06) for stats:
req.Header.Set("Ideal-Node", reg.Nodes[0].Name)
req.Header.Set(derp.IdealNodeHeader, reg.Nodes[0].Name)
// TODO(bradfitz,raggi): start a time.AfterFunc for 30m-1h or so to
// dialNode(reg.Nodes[0]) and see if we can even TCP connect to it. If
// so, TLS handshake it as well (which is mixed up in this massive

View File

@@ -21,6 +21,8 @@ const fastStartHeader = "Derp-Fast-Start"
// Handler returns an http.Handler to be mounted at /derp, serving s.
func Handler(s *derp.Server) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
ctx := r.Context()
// These are installed both here and in cmd/derper. The check here
// catches both cmd/derper run with DERP disabled (STUN only mode) as
// well as DERP being run in tests with derphttp.Handler directly,
@@ -66,7 +68,11 @@ func Handler(s *derp.Server) http.Handler {
pubKey.UntypedHexString())
}
s.Accept(r.Context(), netConn, conn, netConn.RemoteAddr().String())
if v := r.Header.Get(derp.IdealNodeHeader); v != "" {
ctx = derp.IdealNodeContextKey.WithValue(ctx, v)
}
s.Accept(ctx, netConn, conn, netConn.RemoteAddr().String())
})
}

View File

@@ -26,6 +26,10 @@ var testHookWatchLookConnectResult func(connectError error, wasSelfConnect bool)
// returns.
//
// Otherwise, the add and remove funcs are called as clients come & go.
// Note that add is called for every new connection and remove is only
// called for the final disconnection. See https://github.com/tailscale/tailscale/issues/13566.
// This behavior will likely change. Callers should do their own accounting
// and dup suppression as needed.
//
// infoLogf, if non-nil, is the logger to write periodic status updates about
// how many peers are on the server. Error log output is set to the c's logger,

View File

@@ -14,6 +14,7 @@
<string id="PARTIAL_FULL_SINCE_V1_56">Tailscale version 1.56.0 and later (full support), some earlier versions (partial support)</string>
<string id="SINCE_V1_58">Tailscale version 1.58.0 and later</string>
<string id="SINCE_V1_62">Tailscale version 1.62.0 and later</string>
<string id="SINCE_V1_74">Tailscale version 1.74.0 and later</string>
<string id="Tailscale_Category">Tailscale</string>
<string id="UI_Category">UI customization</string>
<string id="Settings_Category">Settings</string>
@@ -42,6 +43,20 @@ To require logging in to a particular tailnet, add the "required:" prefix, such
If you configure this policy, set it to the name of the tailnet, possibly with the "required:" prefix, as described above.
If you disable this policy, the standard login page will be used.]]></string>
<string id="AuthKey">Specify the auth key to authenticate devices without user interaction</string>
<string id="AuthKey_Help"><![CDATA[This policy allows specifying the default auth key to be used when registering new devices without requiring sign-in via a web browser, unless the user specifies a different auth key via the CLI.
Managing authentication keys via Group Policy and MDM solutions poses significant security risks. Group Policy is not designed to store and deploy secrets, and by default, Group Policy settings can be read by all domain-authenticated users and devices, regardless of their privilege level or whether the policy setting applies to them.
While MDM solutions tend to offer better control over who can access the policy setting values, they can still be compromised. Additionally, with both Group Policy and MDM solutions, the auth key is always readable by all users who have access to the device where this policy setting applies, as well as by all applications running on the device. A compromised auth key can potentially be used by a malicious actor to gain or elevate access to the target network.
Only consider this option after carefully reviewing the organization's security posture. For example, ensure you configure the auth keys specifically for the tag of the device and that access control policies only grant necessary access between the tailnet and the tagged device. Additionally, consider using short-lived auth keys, one-time auth keys (with one GPO/MDM configuration per device), Device Approval, and/or Tailnet lock to minimize risk. If you suspect an auth key has been compromised, revoke the auth key immediately.
If you configure this policy setting and specify an auth key, it will be used to authenticate the device unless the device is already logged in or an auth key is explicitly specified via the CLI.
If you disable or do not configure this policy setting, an interactive user login will be required..
See https://tailscale.com/kb/1315/mdm-keys#set-an-auth-key for more details.]]></string>
<string id="ExitNodeID">Require using a specific Exit Node</string>
<string id="ExitNodeID_Help"><![CDATA[This policy can be used to require always using the specified Exit Node whenever the Tailscale client is connected.
See https://tailscale.com/kb/1315/mdm-keys#force-an-exit-node-to-always-be-used and https://tailscale.com/kb/1103/exit-nodes for more details.
@@ -219,6 +234,11 @@ See https://tailscale.com/kb/1315/mdm-keys#set-your-organization-name for more d
<label>Tailnet:</label>
</textBox>
</presentation>
<presentation id="AuthKey">
<textBox refId="AuthKeyPrompt">
<label>Auth Key:</label>
</textBox>
</presentation>
<presentation id="ExitNodeID">
<textBox refId="ExitNodeIDPrompt">
<label>Exit Node:</label>

View File

@@ -46,6 +46,10 @@
displayName="$(string.SINCE_V1_62)">
<and><reference ref="TAILSCALE_PRODUCT"/></and>
</definition>
<definition name="SINCE_V1_74"
displayName="$(string.SINCE_V1_74)">
<and><reference ref="TAILSCALE_PRODUCT"/></and>
</definition>
</definitions>
</supportedOn>
<categories>
@@ -79,6 +83,13 @@
<text id="TailnetPrompt" valueName="Tailnet" required="true" />
</elements>
</policy>
<policy name="AuthKey" class="Machine" displayName="$(string.AuthKey)" explainText="$(string.AuthKey_Help)" presentation="$(presentation.AuthKey)" key="Software\Policies\Tailscale">
<parentCategory ref="Top_Category" />
<supportedOn ref="SINCE_V1_74" />
<elements>
<text id="AuthKeyPrompt" valueName="AuthKey" required="true" />
</elements>
</policy>
<policy name="ExitNodeID" class="Machine" displayName="$(string.ExitNodeID)" explainText="$(string.ExitNodeID_Help)" presentation="$(presentation.ExitNodeID)" key="Software\Policies\Tailscale">
<parentCategory ref="Settings_Category" />
<supportedOn ref="SINCE_V1_56" />

Some files were not shown because too many files have changed in this diff Show More