Expose enough metrics to get a sense of queue depth, use and if it has
stalled.
Updates tailscale/corp#26058
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I271ac8d03f3db587a33aca6964fe92f2833e1251
The CN field is technically deprecated; set the requested name in a DNS SAN
extension in addition to maximise compatibility with RFC 8555.
Fixes#14762
Change-Id: If5d27f1e7abc519ec86489bf034ac98b2e613043
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
The timeout still defaults to 2 seconds, but can now be changed via command-line flag.
Updates tailscale/corp#26045
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This interface is used both by the DERP client as well as the server.
Defining the interface in derp.go makes it clear that it is shared.
Updates tailscale/corp#26045
Signed-off-by: Percy Wegmann <percy@tailscale.com>
3dabea0fc2 added some docs with inconsistent usage docs.
This fixes them, and adds a test.
It also adds some other tests and fixes other verb tense
inconsistencies.
Updates tailscale/corp#25278
Change-Id: I94c2a8940791bddd7c35c1c3d5fb791a317370c2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In v1.78, we started acquiring the GP lock when reading policy settings. This led to a deadlock during
Tailscale installation via Group Policy Software Installation because the GP engine holds the write lock
for the duration of policy processing, which in turn waits for the installation to complete, which in turn
waits for the service to enter the running state.
In this PR, we prevent the acquisition of GP locks (aka EnterCriticalPolicySection) during service startup
and update the Windows Registry-based util/syspolicy/source.PlatformPolicyStore to handle this failure
gracefully. The GP lock is somewhat optional; it’s safe to read policy settings without it, but acquiring
the lock is recommended when reading multiple values to prevent the Group Policy engine from modifying
settings mid-read and to avoid inconsistent results.
Fixes#14416
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This was a slow memory leak on busy tailnets with lots of tagged
ephemeral nodes.
Updates tailscale/corp#26058
Change-Id: I298e7d438e3ffbb3cde795640e344671d244c632
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Still behind the same ts_omit_tap build tag.
See #14738 for background on the pattern.
Updates #12614
Change-Id: I03fb3d2bf137111e727415bd8e713d8568156ecc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If we fail to parse the upstream DNS response in an app connector, we
might miss new IPs for the target domain. Log parsing errors to be able
to diagnose that.
Updates #14606
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Remove "unexpected" labelling of PeerGoneReasonNotHere.
A peer being no longer connected to a DERP server
is not an unexpected case and causes confusion in looking at logs.
Fixestailscale/corp#25609
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
The new ProxyGroup-based Ingress reconciler is causing a fatal log at
startup because it has the same name as the existing Ingress reconciler.
Explicitly name both to ensure they have unique names that are consistent
with other explicitly named reconcilers.
Updates #14583
Change-Id: Ie76e3eaf3a96b1cec3d3615ea254a847447372ea
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This pulls out the Wake-on-LAN (WoL) code out into its own package
(feature/wakeonlan) that registers itself with various new hooks
around tailscaled.
Then a new build tag (ts_omit_wakeonlan) causes the package to not
even be linked in the binary.
Ohter new packages include:
* feature: to just record which features are loaded. Future:
dependencies between features.
* feature/condregister: the package with all the build tags
that tailscaled, tsnet, and the Tailscale Xcode project
extension can empty (underscore) import to load features
as a function of the defined build tags.
Future commits will move of our "ts_omit_foo" build tags into this
style.
Updates #12614
Change-Id: I9c5378dafb1113b62b816aabef02714db3fc9c4a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* Reapply "ipn/ipnlocal: re-advertise appc routes on startup (#14609)"
This reverts commit 51adaec35a.
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
* ipn/ipnlocal: fix a deadlock in readvertiseAppConnectorRoutes
Don't hold LocalBackend.mu while calling the methods of
appc.AppConnector. Those methods could call back into LocalBackend and
try to acquire it's mutex.
Fixes https://github.com/tailscale/corp/issues/25965Fixes#14606
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
---------
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Updates tailscale/corp#25278
Adds definitions for new CLI commands getting added in v1.80. Refactors some pre-existing CLI commands within the `configure` tree to clean up code.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Rather than using a string everywhere and needing to clarify that the
string should have the svc: prefix, create a separate type for Service
names.
Updates tailscale/corp#24607
Change-Id: I720e022f61a7221644bb60955b72cacf42f59960
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
This commit intend to provide support for TCP and Web VIP services and also allow user to use Tun
for VIP services if they want to.
The commit includes:
1.Setting TCP intercept function for VIP Services.
2.Update netstack to send packet written from WG to netStack handler for VIP service.
3.Return correct TCP hander for VIP services when netstack acceptTCP.
This commit also includes unit tests for if the local backend setServeConfig would set correct TCP intercept
function and test if a hander gets returned when getting TCPHandlerForDst. The shouldProcessInbound
check is not unit tested since the test result just depends on mocked functions. There should be an integration
test to cover shouldProcessInbound and if the returned TCP handler actually does what the serveConfig says.
Updates tailscale/corp#24604
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
We previously baked in the LetsEncrypt x509 root CA for our tlsdial
package.
This moves that out into a new "bakedroots" package and is now also
shared by ipn/ipnlocal's cert validation code (validCertPEM) that
decides whether it's time to fetch a new cert.
Otherwise, a machine without LetsEncrypt roots locally in its system
roots is unable to use tailscale cert/serve and fetch certs.
Fixes#14690
Change-Id: Ic88b3bdaabe25d56b9ff07ada56a27e3f11d7159
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Both @agottardo and I tripped over this today.
Updates #cleanup
Change-Id: I64380a03bfc952b9887b1512dbcadf26499ff1cd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If unable to accept a connection from the bandwidth probe listener,
return from the goroutine immediately since the accepted connection
will be nil.
Updates tailscale/corp#25958
Signed-off-by: Percy Wegmann <percy@tailscale.com>
I saw this panic while writing a new test for #14715:
panic: send on closed channel
goroutine 826 [running]:
tailscale.com/tsnet.(*listener).handle(0x1400031a500, {0x1035fbb00, 0x14000b82300})
/Users/bradfitz/src/tailscale.com/tsnet/tsnet.go:1317 +0xac
tailscale.com/wgengine/netstack.(*Impl).acceptTCP(0x14000204700, 0x14000882100)
/Users/bradfitz/src/tailscale.com/wgengine/netstack/netstack.go:1320 +0x6dc
created by gvisor.dev/gvisor/pkg/tcpip/transport/tcp.(*Forwarder).HandlePacket in goroutine 807
/Users/bradfitz/go/pkg/mod/gvisor.dev/gvisor@v0.0.0-20240722211153-64c016c92987/pkg/tcpip/transport/tcp/forwarder.go:98 +0x32c
FAIL tailscale.com/tsnet 0.927s
Updates #14715
Change-Id: I9924e0a6c2b801d46ee44eb8eeea0da2f9ea17c4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
cmd/k8s-operator: add logic to parse L7 Ingresses in HA mode
- Wrap the Tailscale API client used by the Kubernetes Operator
into a client that knows how to manage VIPServices.
- Create/Delete VIPServices and update serve config for L7 Ingresses
for ProxyGroup.
- Ensure that ingress ProxyGroup proxies mount serve config from a shared ConfigMap.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Adds a new Hostinfo.IngressEnabled bool field that holds whether
funnel is currently enabled for the node. Triggers control update
when this value changes.
Bumps capver so that control can distinguish the new field being false
vs non-existant in previous clients.
This is part of a fix for an issue where nodes with any AllowFunnel
block set in their serve config are being displayed as if actively
routing funnel traffic in the admin panel.
Updates tailscale/tailscale#11572
Updates tailscale/corp#25931
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We throw error early with a warning if users attempt to enable background funnel
for a node that does not allow incoming connections
(shields up), but if it done in foreground mode, we just silently fail
(the funnel command succeeds, but the connections are not allowed).
This change makes sure that we also error early in foreground mode.
Updates tailscale/tailscale#11049
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Updates tailscale/corp#25936
This defines a new syspolicy 'Hostname' and allows an IT administrator to override the value we normally read from os.Hostname(). This is particularly useful on Android and iOS devices, where the hostname we get from the OS is really just the device model (a platform restriction to prevent fingerprinting).
If we don't implement this, all devices on the customer's side will look like `google-pixel-7a-1`, `google-pixel-7a-2`, `google-pixel-7a-3`, etc. and it is not feasible for the customer to use the API or worse the admin console to manually fix these names.
Apply code review comment by @nickkhyl
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Co-authored-by: Nick Khyl <1761190+nickkhyl@users.noreply.github.com>
Since 5297bd2cff, tstun.Wrapper has required its Start
method to be called for it to function. Failure to do so just
results in weird hangs and I've wasted too much time multiple
times now debugging. Hopefully this prevents more lost time.
Updates tailscale/corp#24454
Change-Id: I87f4539f7be7dc154627f8835a37a8db88c31be0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Metrics currently exist for dropped packets by reason, and total
received packets by kind (e.g., `disco` or `other`), but relating these
two together to gleam information about the drop rate for specific
reasons on a per-kind basis is not currently possible.
Change `derp_packets_dropped` to use a `metrics.MultiLabelMap` to
track both the `reason` and `kind` in the same metric to allow for this
desired level of granularity.
Drop metrics that this makes unnecessary (namely `packetsDroppedReason`
and `packetsDroppedType`).
Updates https://github.com/tailscale/corp/issues/25489
Signed-off-by: Mario Minardi <mario@tailscale.com>
Most users should not run into this because it's set in the helm chart
and the deploy manifest, but if namespace is not set we get confusing
authz errors because the kube client tries to fetch some namespaced resources
as though they're cluster-scoped and reports permission denied. Try to
detect namespace from the default projected volume, and otherwise fatal.
Fixes #cleanup
Change-Id: I64b34191e440b61204b9ad30bbfa117abbbe09c3
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
If the server was in use at the time of the initial check, but disconnected and was removed
from the activeReqs map by the time we registered a waiter, the ready channel will never
be closed, resulting in a deadlock. To avoid this, we check whether the server is still busy
after registering the wait.
Fixes#14655
Signed-off-by: Nick Khyl <nickk@tailscale.com>
I made a last-minute change in #14626 to split a single loop that created 1_000 concurrent
connections into an inner and outer loop that create 100 concurrent connections 10 times.
This introduced a race because the last user's connection may still be active (from the server's
perspective) when a new outer iteration begins. Since every new client gets a unique ClientID,
but we reuse usernames and UIDs, the server may let a user in (as the UID matches, which is fine),
but the test might then fail due to a ClientID mismatch:
server_test.go:232: CurrentUser(Initial): got &{S-1-5-21-1-0-0-1001 User-4 <nil> Client-2 false false};
want &{S-1-5-21-1-0-0-1001 User-4 <nil> Client-114 false false}
In this PR, we update (*testIPNServer).blockWhileInUse to check whether the server is currently busy
and wait until it frees up. We then call blockWhileInUse at the end of each outer iteration so that the server
is always in a known idle state at the beginning of the inner loop. We also check that the current user
is not set when the server is idle.
Updates tailscale/corp#25804
Updates #14655 (found when working on it)
Signed-off-by: Nick Khyl <nickk@tailscale.com>
As we look to add github.com/prometheus/client_golang/prometheus to
more parts of the codebase, lock in that we don't use it in tailscaled,
primarily for binary size reasons.
Updates #12614
Change-Id: I03c100d12a05019a22bdc23ce5c4df63d5a03ec6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This test verifies, among other things, that init functions cannot be deferred after (*DeferredFuncs).Do
has already been called and that all subsequent calls to (*DeferredFuncs).Defer return false.
However, the initial implementation of this check was racy: by the time (*DeferredFuncs).Do returned,
not all goroutines that successfully deferred an init function may have incremented the atomic variable
tracking the number of deferred functions. As a result, the variable's value could differ immediately
after (*DeferredFuncs).Do returned and after all goroutines had completed execution (i.e., after wg.Wait()).
In this PR, we replace the original racy check with a different one. Although this new check is also racy,
it can only produce false negatives. This means that if the test fails, it indicates an actual bug rather than
a flaky test.
Fixes#14039
Signed-off-by: Nick Khyl <nickk@tailscale.com>
There's at least one example of stored routes and advertised routes
getting out of sync. I don't know how they got there yet, but this would
backfill missing advertised routes on startup from stored routes.
Also add logging in LocalBackend.AdvertiseRoute to record when new
routes actually get put into prefs.
Updates #14606
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
I moved the actual rename into separate, GOOS-specific files. On
non-Windows, we do a simple os.Rename. On Windows, we first try
ReplaceFile with a fallback to os.Rename if the target file does
not exist.
ReplaceFile is the recommended way to rename the file in this use case,
as it preserves attributes and ACLs set on the target file.
Updates #14428
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
The --mesh-with flag now supports the specification of hostname tuples like
derp1a.tailscale.com/derp1a-vpc.tailscale.com, which instructs derp to mesh
with host 'derp1a.tailscale.com' but dial TCP connections to 'derp1a-vpc.tailscale.com'.
For backwards compatibility, --mesh-with still supports individual hostnames.
The logic which attempts to auto-discover '[host]-vpc.tailscale.com' dial hosts
has been removed.
Updates tailscale/corp#25653
Signed-off-by: Percy Wegmann <percy@tailscale.com>
We have observed some clients with extremely large lists of IPv6
endpoints, in some cases from subnets where the machine also has the
zero address for a whole /48 with then arbitrary addresses additionally
assigned within that /48. It is in general unnecessary for reachability
to report all of these addresses, typically only one will be necessary
for reachability. We report two, to cover some other common cases such
as some styles of IPv6 private address rotations.
Updates tailscale/corp#25850
Signed-off-by: James Tucker <james@tailscale.com>
In this commit, we add a failing test to verify that ipn/ipnserver.Server correctly
sets and unsets the current user when two different clients send requests concurrently
(A sends request, B sends request, A's request completes, B's request completes).
The expectation is that the user who wins the race becomes the current user
from the LocalBackend's perspective, remaining in this state until they disconnect,
after which a different user should be able to connect and use the LocalBackend.
We then fix the second of two bugs in (*Server).addActiveHTTPRequest, where a race
condition causes the LocalBackend's state to be reset after a new client connects,
instead of after the last active request of the previous client completes and the server
becomes idle.
Fixestailscale/corp#25804
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this commit, we add a failing test to verify that ipn/ipnserver.Server correctly
sets and unsets the current user when two different users connect sequentially
(A connects, A disconnects, B connects, B disconnects).
We then fix the test by updating (*ipn/ipnserver.Server).addActiveHTTPRequest
to avoid calling (*LocalBackend).ResetForClientDisconnect again after a new user
has connected and been set as the current user with (*LocalBackend).SetCurrentUser().
Since ipn/ipnserver.Server does not allow simultaneous connections from different
Windows users and relies on the LocalBackend's current user, and since we already
reset the LocalBackend's state by calling ResetForClientDisconnect when the last
active request completes (indicating the server is idle and can accept connections
from any Windows user), it is unnecessary to track the last connected user on the
ipnserver.Server side or call ResetForClientDisconnect again when the user changes.
Additionally, the second call to ResetForClientDisconnect occurs after the new user
has been set as the current user, resetting the correct state for the new user
instead of the old state of the now-disconnected user, causing issues.
Updates tailscale/corp#25804
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We update client/tailscale.LocalClient to allow specifying an optional Transport
(http.RoundTripper) for LocalAPI HTTP requests, and implement one that injects
an ipnauth.TestActor via request headers. We also add several functions and types
to make testing an ipn/ipnserver.Server possible (or at least easier).
We then use these updates to write basic tests for ipnserver.Server,
ensuring it works on non-Windows platforms and correctly sets and unsets
the LocalBackend's current user when a Windows user connects and disconnects.
We intentionally omit tests for switching between different OS users
and will add them in follow-up commits.
Updates tailscale/corp#25804
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In preparation for adding test coverage for ipn/ipnserver.Server, we update it
to use ipnauth.Actor instead of its concrete implementation where possible.
Updates tailscale/corp#25804
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We build up maps of both the existing MagicDNS configuration in hosts
and the desired MagicDNS configuration, compare the two, and only
write out a new one if there are changes. The comparison doesn't need
to be perfect, as the occasional false-positive is fine, but this
should greatly reduce rewrites of the hosts file.
I also changed the hosts updating code to remove the CRLF/LF conversion
stuff, and use Fprintf instead of Frintln to let us write those inline.
Updates #14428
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
This deprecates the old "DERP string" packing a DERP region ID into an
IP:port of 127.3.3.40:$REGION_ID and just uses an integer, like
PeerChange.DERPRegion does.
We still support servers sending the old form; they're converted to
the new form internally right when they're read off the network.
Updates #14636
Change-Id: I9427ec071f02a2c6d75ccb0fcbf0ecff9f19f26f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This doesn't seem to have any immediate impact, but not allowing access via the IPv6 masquerade
address when an IPv4 masquerade address is also set seems like a bug.
Updates #cleanup
Updates #14570 (found when working on it)
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This finishes the work started in #14616.
Updates #8632
Change-Id: I4dc07d45b1e00c3db32217c03b21b8b1ec19e782
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In this PR, we add a generic views.ValuePointer type that can be used as a view for pointers
to basic types and struct types that do not require deep cloning and do not have corresponding
view types. Its Get/GetOk methods return stack-allocated shallow copies of the underlying value.
We then update the cmd/viewer codegen to produce getters that return either concrete views
when available or ValuePointer views when not, for pointer fields in generated view types.
This allows us to avoid unnecessary allocations compared to returning pointers to newly
allocated shallow copies.
Updates #14570
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This amends commit b7e48058c8.
That commit broke all documented ways of starting Tailscale on gokrazy:
https://gokrazy.org/packages/tailscale/ — both Option A (tailscale up)
and Option B (tailscale up --auth-key) rely on the tailscale CLI working.
I verified that the tailscale CLI just prints it help when started
without arguments, i.e. it does not stay running and is not restarted.
I verified that the tailscale CLI successfully exits when started with
tailscale up --auth-key, regardless of whether the node has joined
the tailnet yet or not.
I verified that the tailscale CLI successfully waits and exits when
started with tailscale up, as expected.
fixes https://github.com/gokrazy/gokrazy/issues/286
Signed-off-by: Michael Stapelberg <michael@stapelberg.de>
This will enable Prometheus queries to look at the bandwidth over time windows,
for example 'increase(derp_bw_bytes_total)[1h] / increase(derp_bw_transfer_time_seconds_total)[1h]'.
Fixes commit a51672cafd.
Updates tailscale/corp#25503
Signed-off-by: Percy Wegmann <percy@tailscale.com>
sync.OnceValue and slices.Compact were both added in Go 1.21.
cmp.Or was added in Go 1.22.
Updates #8632
Updates #11058
Change-Id: I89ba4c404f40188e1f8a9566c8aaa049be377754
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Bump the versions to pick up some CVE patches. They don't affect us, but
customer scanners will complain.
Updates #cleanup
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Most of these are effectively no-ops, but appease security scanners.
At least one (x/net for x/net/html) only affect builds from the open source repo,
since we already had it updated in our "corp" repo:
golang.org/x/net v0.33.1-0.20241230221519-e9d95ba163f7
... and that's where we do the official releases from. e.g.
tailscale.io % go install tailscale.com/cmd/tailscaled
tailscale.io % go version -m ~/go/bin/tailscaled | grep x/net
dep golang.org/x/net v0.33.1-0.20241230221519-e9d95ba163f7 h1:raAbYgZplPuXQ6s7jPklBFBmmLh6LjnFaJdp3xR2ljY=
tailscale.io % cd ../tailscale.com
tailscale.com % go install tailscale.com/cmd/tailscaled
tailscale.com % go version -m ~/go/bin/tailscaled | grep x/net
dep golang.org/x/net v0.33.0 h1:74SYHlV8BIgHIFC/LrYkOGIwL19eTYXQ5wc6TBuO36I=
Updates #8043
Updates #14599
Change-Id: I6e238cef62ca22444145a5313554aab8709b33c9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
cmd/containerboot: load containerboot serve config that does not contain HTTPS endpoint in tailnets with HTTPS disabled
Fixes an issue where, if a tailnet has HTTPS disabled, no serve config
set via TS_SERVE_CONFIG was loaded, even if it does not contain an HTTPS endpoint.
Now for tailnets with HTTPS disabled serve config provided to containerboot is considered invalid
(and therefore not loaded) only if there is an HTTPS endpoint defined in the config.
Fixestailscale/tailscale#14495
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
cmd/{k8s-operator,containerboot}: reload tailscaled configfile when its contents have changed
Instead of restarting the Kubernetes Operator proxies each time
tailscaled config has changed, this dynamically reloads the configfile
using the new reload endpoint.
Older annotation based mechanism will be supported till 1.84
to ensure that proxy versions prior to 1.80 keep working with
operator 1.80 and newer.
Updates tailscale/tailscale#13032
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
If the total number of differences is less than a small amount, just do
the dumb quadratic thing and compare every single object instead of
allocating a map.
Updates tailscale/corp#25479
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I8931b4355a2da4ec0f19739927311cf88711a840
Extracted from some code written in the other repo.
Updates tailscale/corp#25479
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I6df062fdffa1705524caa44ac3b6f2788cf64595
This will enable Prometheus queries to look at the bandwidth over time windows,
for example 'increase(derp_bw_bytes_total)[1h] / increase(derp_bw_transfer_time_seconds_total)[1h]'.
Updates tailscale/corp#25503
Signed-off-by: Percy Wegmann <percy@tailscale.com>
* cmd/k8s-operator,k8s-operator: allow users to set custom labels for the optional ServiceMonitor
Updates tailscale/tailscale#14381
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Change the type of the `IPv4` and `IPv6` members in the `nodeData`
struct to be `netip.Addr` instead of `string`.
We were previously calling `String()` on this struct, which returns
"invalid IP" when the `netip.Addr` is its zero value, and passing this
value into the aforementioned attributes.
This caused rendering issues on the frontend
as we were assuming that the value for `IPv4` and `IPv6` would be falsy
in this case.
The zero value for a `netip.Addr` marshalls to an empty string instead
which is the behaviour we want downstream.
Updates https://github.com/tailscale/tailscale/issues/14568
Signed-off-by: Mario Minardi <mario@tailscale.com>
Extracted from some code written in the other repo.
Updates tailscale/corp#25479
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I92c97a63a8f35cace6e89a730938ea587dcefd9b
Currently this does not yet do anything apart from creating
the ProxyGroup resources like StatefulSet.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
These erroneously blocked a recent PR, which I fixed by simply
re-running CI. But we might as well fix them anyway.
These are mostly `printf` to `print` and a couple of `!=` to `!Equal()`
Updates #cleanup
Signed-off-by: Will Norris <will@tailscale.com>
Remove the platform specificity, it is unnecessary complexity.
Deduplicate repeated code as a result of reduced complexity.
Split out error identification code.
Update call-sites and tests.
Updates #14551
Updates tailscale/corp#25648
Signed-off-by: James Tucker <james@tailscale.com>
Fixestailscale/tailscale#14563
When creating a NoiseClient, ensure that if any private IP address is provided, with both an `http` scheme and an explicit port number, we do not ever attempt to use HTTPS. We were only handling the case of `127.0.0.1` and `localhost`, but `192.168.x.y` is a private IP as well. This uses the `netip` package to check and adds some logging in case we ever need to troubleshoot this.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Observed in the wild some macOS machines gain broken sockets coming out
of sleep (we observe "time jumped", followed by EPIPE on sendto). The
cause of this in the platform is unclear, but the fix is clear: always
rebind if the socket is broken. This can also be created artificially on
Linux via `ss -K`, and other conditions or software on a system could
also lead to the same outcomes.
Updates tailscale/corp#25648
Signed-off-by: James Tucker <james@tailscale.com>
In the process, because I needed it for testing, make all
LocalBackend-managed goroutines be accounted for. And then in tests,
verify they're no longer running during LocalBackend.Shutdown.
Updates tailscale/corp#19681
Change-Id: Iad873d4df7d30103a4a7863dfacf9e078c77e6a3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates #14520
Updates #14517 (in that I pulled this out of there)
Change-Id: Ibc28162816e083fcadf550586c06805c76e378fc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It was failing about an unaccepted risk ("mac-app-connector") because
it was checking runtime.GOOS ("darwin") instead of the test's env.goos
string value ("linux", which doesn't have the warning).
Fixes#14544
Change-Id: I470d86a6ad4bb18e1dd99d334538e56556147835
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates #cleanup
Updates #1909 (noticed while working on that)
Change-Id: I505001e5294287ad2a937b4db61d9e67de70fa14
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Fixes#14492
-----
Developer Certificate of Origin
Version 1.1
Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.
Developer's Certificate of Origin 1.1
By making a contribution to this project, I certify that:
(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
Change-Id: I6dc1068d34bbfa7477e7b7a56a4325b3868c92e1
Signed-off-by: Marc Paquette <marcphilippaquette@gmail.com>
These were the last two Range funcs in this repo.
Updates #12912
Change-Id: I6ba0a911933cb5fc4e43697a9aac58a8035f9622
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The remaining range funcs in the tree are RangeOverTCPs and
RangeOverWebs in ServeConfig; those will be cleaned up separately.
Updates #12912
Change-Id: Ieeae4864ab088877263c36b805f77aa8e6be938d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And misc cleanup along the way.
Updates #12912
Change-Id: I0cab148b49efc668c6f5cdf09c740b84a713e388
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
While working on #13390, I ran across this non-idiomatic
pointer-to-view and parallel-sorted-map accounting code that was all
just to avoid a sort later.
But the sort later when building a new netmap.NetworkMap is already a
drop in the bucket of CPU compared to how much work & allocs
mapSession.netmap and LocalBackend's spamming of the full netmap
(potentially tens of thousands of peers, MBs of JSON) out to IPNBus
clients for any tiny little change (node changing online status, etc).
Removing the parallel sorted slice let everything be simpler to reason
about, so this does that. The sort might take a bit more CPU time now
in theory, but in practice for any netmap size for which it'd matter,
the quadratic netmap IPN bus spam (which we need to fix soon) will
overshadow that little sort.
Updates #13390
Updates #1909
Change-Id: I3092d7c67dc10b2a0f141496fe0e7e98ccc07712
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Importing the ~deprecated golang.org/x/exp/maps as "xmaps" to not
shadow the std "maps" was getting ugly.
And using slices.Collect on an iterator is verbose & allocates more.
So copy (x)maps.Keys+Values into our slicesx package instead.
Updates #cleanup
Updates #12912
Updates #14514 (pulled out of that change)
Change-Id: I5e68d12729934de93cf4a9cd87c367645f86123a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Using context.CancelFunc as the type (instead of func()) answers
questions like whether it's okay to call it multiple times, whether
it blocks, etc. And that's the type it actually is in this case.
Updates #cleanup
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
On Linux, systray.SetTitle actually seems to set the tooltip on all
desktops I've tested on. But on macOS, it actually does set a title
that is always displayed in the systray area next to the icon. This
change should properly set the tooltip across platforms.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
Move a number of global state vars into the Menu struct, keeping things
better encapsulated. The systray package still relies on its own global
state, so only a single Menu instance can run at a time.
Move a lot of the initialization logic out of onReady, in particular
fetching the latest tailscale state. Instead, populate the state before
calling systray.Run, which fixes a timing issue in GNOME (#14477).
This change also creates a separate bgContext for actions not tied menu
item clicks. Because we have to rebuild the entire menu regularly, we
cancel that context as needed, which can cancel subsequent updateState
calls.
Also exit cleanly on SIGINT and SIGTERM.
Updates #1708Fixes#14477
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
Refactor code to set app icon and title as part of rebuild, rather than
separately in eventLoop. This fixes several cases where they weren't
getting updated properly. This change also makes use of the new exit
node icons.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
restructure tsLogo to allow setting a mask to be used when drawing the
logo dots, as well as add an overlay icon, such as the arrow when
connected to an exit node.
The icon is still renders as white on black, but this change also
prepare for doing a black on white version, as well a fully transparent
icon. I don't know if we can consistently determine which to use, so
this just keeps the single icon for now.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
metrics.LabelMap grows slightly more heavy, needing a lock to ensure
proper ordering for newly initialized ShardedInt values. An Add method
enables callers to use .Add for both expvar.Int and syncs.ShardedInt
values, but retains the original behavior of defaulting to initializing
expvar.Int values.
Updates tailscale/corp#25450
Co-Authored-By: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: James Tucker <james@tailscale.com>
- rebuild menu when prefs change outside of systray, such as setting an
exit node
- refactor onClick handler code
- compare lowercase country name, the same as macOS and Windows (now
sorts Ukraine before USA)
- fix "connected / disconnected" menu items on stopped status
- prevent nil pointer on "This Device" menu item
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
Noted as useful during review of #14448.
Updates #14457
Change-Id: I0f16f08d5b05a8e9044b19ef6c02d3dab497f131
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Remove EOL Ubuntu versions.
Add new Ubuntu LTS.
Update Alpine to test latest version.
Also, make the test run when its workflow is updated and installer.sh isn't.
Updates #cleanup
Signed-off-by: Erisa A <erisa@tailscale.com>
This commit builds the exit node menu including the recommended exit
node, if available, as well as tailnet and mullvad exit nodes.
This does not yet update the menu based on changes in exit node outside
of the systray app, which will come later. This also does not include
the ability to run as an exit node.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
* tailcfg: rename and retype ServiceHost capability, add value type
Updates tailscale/corp#22743.
In #14046, this was accidentally made a PeerCapability when it
should have been NodeCapability. Also, renaming it to use the
nomenclature that we decided on after #14046 went up, and adding
the type of the value that will be passed down in the RawMessage
for this capability.
This shouldn't break anything, since no one was using this string or
variable yet.
Signed-off-by: Naman Sood <mail@nsood.in>
The new menu delay added to fix libdbusmenu systrays causes problems
with KDE. Given the state of wildly varying systray implementations, I
suspect we may need more desktop-specific hacks, so I'm setting this up
to accommodate that.
Updates #1708
Updates #14431
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
Bring UI closer to macOS and windows:
- split login and tailnet name over separate lines
- render profile picture (with very simple caching)
- use checkbox to indicate active profile. I've not found any desktops
that can't render checkboxes, so I'd like to explore other options
if needed.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
ShardedInt provides an int type expvar.Var that supports more efficient
writes at high frequencies (one order of magnigude on an M1 Max, much
more on NUMA systems).
There are two implementations of ShardValue, one that abuses sync.Pool
that will work on current public Go versions, and one that takes a
dependency on a runtime.TailscaleP function exposed in Tailscale's Go
fork. The sync.Pool variant has about 10x the throughput of a single
atomic integer on an M1 Max, and the runtime.TailscaleP variant is about
10x faster than the sync.Pool variant.
Neither variant have perfect distribution, or perfectly always avoid
cross-CPU sharing, as there is no locking or affinity to ensure that the
time of yield is on the same core as the time of core biasing, but in
the average case the distributions are enough to provide substantially
better performance.
See golang/go#18802 for a related upstream proposal.
Updates tailscale/go#109
Updates tailscale/corp#25450
Signed-off-by: James Tucker <james@tailscale.com>
Some notification managers crop the application icon to a circle, so
ensure we have enough padding to account for that.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
This new type of probe sends DERP packets sized similarly to CallMeMaybe packets
at a rate of 10 packets per second. It records the round-trip times in a Prometheus
histogram. It also keeps track of how many packets are dropped. Packets that fail to
arrive within 5 seconds are considered dropped.
Updates tailscale/corp#24522
Signed-off-by: Percy Wegmann <percy@tailscale.com>
MutexValue is simply a value guarded by a mutex.
For any type that is not pointer-sized,
MutexValue will perform much better than AtomicValue
since it will not incur an allocation boxing the value
into an interface value (which is how Go's atomic.Value
is implemented under-the-hood).
Updates #cleanup
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
This is an experiment to see how useful we will find it to have some
text-based diagrams to document how various components of the operator
work. There are no plans to link to this from elsewhere yet, but
hopefully it will be a useful reference internally.
Updates #cleanup
Change-Id: If5911ed39b09378fec0492e87738ec0cc3d8731e
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
For https://github.com/tailscale/go/pull/108 so we can depend on it in
other repos. (This repo can't yet use it; we permit building
tailscale/tailscale with the latest stock Go release) But that will be
in Go 1.24. We're just impatient elsewhere and would like it in the
control plane code earlier.
Updates tailscale/corp#25406
Change-Id: I53ff367318365c465cbd02cea387c8ff1eb49fab
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The omitzero tag option has been backported to v1 "encoding/json"
from the "encoding/json/v2" prototype and will land in Go1.24.
Until we fully upgrade to Go1.24, adjust the test to be agnostic
to which version of Go someone is using.
Updates tailscale/corp#25406
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
The go-httpstat package has a data race when used with connections that
are performing happy-eyeballs connection setups as we are in the DERP
client. There is a long-stale PR upstream to address this, however
revisiting the purpose of this code suggests we don't really need
httpstat here.
The code populates a latency table that may be used to compare to STUN
latency, which is a lightweight RTT check. Switching out the reported
timing here to simply the request HTTP request RTT avoids the
problematic package.
Fixestailscale/corp#25095
Signed-off-by: James Tucker <james@tailscale.com>
When we first made Tailscale SSH, we assumed people would want public
key support soon after. Turns out that hasn't been the case; people
love the Tailscale identity authentication and check mode.
In light of CVE-2024-45337, just remove all our public key code to not
distract people, and to make the code smaller. We can always get it
back from git if needed.
Updates tailscale/corp#25131
Updates golang/go#70779
Co-authored-by: Percy Wegmann <percy@tailscale.com>
Change-Id: I87a6e79c2215158766a81942227a18b247333c22
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The errors emitted by util/dnsname are all written at least moderately
friendly and none of them emit sensitive information. They should be
safe to display to end users.
Updates tailscale/corp#9025
Change-Id: Ic58705075bacf42f56378127532c5f28ff6bfc89
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
The IfElse function is equivalent to the ternary (c ? a : b) operator
in many other languages like C. Unfortunately, this function
cannot perform short-circuit evaluation like in many other languages,
but this is a restriction that's not much different
than the pre-existing cmp.Or function.
The argument against ternary operators in Go is that
nested ternary operators become unreadable
(e.g., (c1 ? (c2 ? a : b) : (c2 ? x : y))).
But a single layer of ternary expressions can sometimes
make code much more readable.
Having the bools.IfElse function gives code authors the
ability to decide whether use of this is more readable or not.
Obviously, code authors will need to be judicious about
their use of this helper function.
Readability is more of an art than a science.
Updates #cleanup
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Throughout our codebase we have types that only exist only
to implement an io.Reader or io.Writer, when it would have been
simpler, cleaner, and more readable to use an inlined function literal
that closes over the relevant types.
This is arguably more readable since it keeps the semantic logic
in place rather than have it be isolated elsewhere.
Note that a function literal that closes over some variables
is semantic equivalent to declaring a struct with fields and
having the Read or Write method mutate those fields.
Updates #cleanup
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
This is the start of an integration/e2e test suite for the tailscale operator.
It currently only tests two major features, ingress proxy and API server proxy,
but we intend to expand it to cover more features over time. It also only
supports manual runs for now. We intend to integrate it into CI checks in a
separate update when we have planned how to securely provide CI with the secrets
required for connecting to a test tailnet.
Updates #12622
Change-Id: I31e464bb49719348b62a563790f2bc2ba165a11b
Co-authored-by: Irbe Krumina <irbe@tailscale.com>
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
A method on kc was called unconditionally, even if was not initialized,
leading to a nil pointer dereference when TS_SERVE_CONFIG was set
outside Kubernetes.
Add a guard symmetric with other uses of the kubeClient.
Fixes#14354.
Signed-off-by: Bjorn Neergaard <bjorn@neersighted.com>
Make dev-mode DERP probes work without TLS. Properly dial port `3340`
when not using HTTPS when dialing nodes in `derphttp_client`. Skip
verifying TLS state in `newConn` if we are not running a prober.
Updates tailscale/corp#24635
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Co-authored-by: Percy Wegmann <percy@tailscale.com>
Use envknob to configure the per client send
queue depth for the derp server.
Fixestailscale/corp#24978
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Previously this unit test failed if it was run in a container. Update the assert
to focus on exactly the condition we are trying to assert: the package type
should only be 'container' if we use the build tag.
Updates #14317
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Make argparsing use flag for adding a new
parameter that requires parsing.
Enforce a read timeout deadline waiting for response
from the stun server provided in the args. Otherwise
the program will never exit.
Fixes#14267
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
OAuth clients that were used to generate an auth_key previously
specified the scope 'device'. 'device' is not an actual scope,
the real scope is 'devices'. The resulting OAuth token ended up
including all scopes from the specified OAuth client, so the code
was able to successfully create auth_keys.
It's better not to hardcode a scope here anyway, so that we have
the flexibility of changing which scope(s) are used in the future
without having to update old clients.
Since the qualifier never actually did anything, this commit simply
removes it.
Updates tailscale/corp#24934
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This package grew organically over time and
is an awful mix of explicitly declared options and
globally set parameters via environment variables and
other subtle effects.
Add a new Options and TransportOptions type to
allow for the creation of a Policy or http.RoundTripper
with some set of options.
The options struct avoids the need to add yet more
NewXXX functions for every possible combination of
ordered arguments.
The goal of this refactor is to allow specifying the http.Client
to use with the Policy.
Updates tailscale/corp#18177
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
If previousEtag is empty, then we assume control ACLs were not modified
manually and push the local ACLs. Instead, we defaulted to localEtag
which would be different if local ACLs were different from control.
AFAIK this was always buggy, but never reported?
Fixes#14295
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Every so often, the ProxyGroup and other controllers lose an optimistic locking race
with other controllers that update the objects they create. Stop treating
this as an error event, and instead just log an info level log line for it.
Fixes#14072
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This provides an interface for a user to force a preferred DERP outcome
for all future netchecks that will take precedence unless the forced
region is unreachable.
The option does not persist and will be lost when the daemon restarts.
Updates tailscale/corp#18997
Updates tailscale/corp#24755
Signed-off-by: James Tucker <james@tailscale.com>
cmd/containerboot,kube/kubetypes,cmd/k8s-operator: detect if Ingress is created in a tailnet that has no HTTPS
This attempts to make Kubernetes Operator L7 Ingress setup failures more explicit:
- the Ingress resource now only advertises HTTPS endpoint via status.ingress.loadBalancer.hostname when/if the proxy has succesfully loaded serve config
- the proxy attempts to catch cases where HTTPS is disabled for the tailnet and logs a warning
Updates tailscale/tailscale#12079
Updates tailscale/tailscale#10407
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
cmd/k8s-operator/deploy/chart: allow reading OAuth creds from a CSI driver's volume and annotating operator's Service account
Updates #14264
Signed-off-by: Oliver Rahner <o.rahner@dke-data.com>
When the operator enables metrics on a proxy, it uses the port 9001,
and in the near future it will start using 9002 for the debug endpoint
as well. Make sure we don't choose ports from a range that includes
9001 so that we never clash. Setting TS_SOCKS5_SERVER, TS_HEALTHCHECK_ADDR_PORT,
TS_OUTBOUND_HTTP_PROXY_LISTEN, and PORT could also open arbitrary ports,
so we will need to document that users should not choose ports from the
10000-11000 range for those settings.
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
* cmd/k8s-operator,k8s-operator,go.mod: optionally create ServiceMonitor
Adds a new spec.metrics.serviceMonitor field to ProxyClass.
If that's set to true (and metrics are enabled), the operator
will create a Prometheus ServiceMonitor for each proxy to which
the ProxyClass applies.
Additionally, create a metrics Service for each proxy that has
metrics enabled.
Updates tailscale/tailscale#11292
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We were previously relying on unintended behaviour by runc where
all containers where by default given read/write/mknod permissions
for tun devices.
This behaviour was removed in https://github.com/opencontainers/runc/pull/3468
and released in runc 1.2.
Containerd container runtime, used by Docker and majority of Kubernetes distributions
bumped runc to 1.2 in 1.7.24 https://github.com/containerd/containerd/releases/tag/v1.7.24
thus breaking our reference tun mode Tailscale Kubernetes manifests and Kubernetes
operator proxies.
This PR changes the all Kubernetes container configs that run Tailscale in tun mode
to privileged. This should not be a breaking change because all these containers would
run in a Pod that already has a privileged init container.
Updates tailscale/tailscale#14256
Updates tailscale/tailscale#10814
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This commit updates ServeConfig to allow configuration to Services (VIPServices for now) via Serve.
The scope of this commit is only adding the Services field to ServeConfig. The field doesn't actually
allow packet flowing yet. The purpose of this commit is to unblock other work on k8s end.
Updates #22953
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/containerboot: serve health on local endpoint
We introduced stable (user) metrics in #14035, and `TS_LOCAL_ADDR_PORT`
with it. Rather than requiring users to specify a new addr/port
combination for each new local endpoint they want the container to
serve, this combines the health check endpoint onto the local addr/port
used by metrics if `TS_ENABLE_HEALTH_CHECK` is used instead of
`TS_HEALTHCHECK_ADDR_PORT`.
`TS_LOCAL_ADDR_PORT` now defaults to binding to all interfaces on 9002
so that it works more seamlessly and with less configuration in
environments other than Kubernetes, where the operator always overrides
the default anyway. In particular, listening on localhost would not be
accessible from outside the container, and many scripted container
environments do not know the IP address of the container before it's
started. Listening on all interfaces allows users to just set one env
var (`TS_ENABLE_METRICS` or `TS_ENABLE_HEALTH_CHECK`) to get a fully
functioning local endpoint they can query from outside the container.
Updates #14035, #12898
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This commit adds a command to validate that all the metrics that
are registring in the client are also present in a path or url.
It is intended to be ran from the KB against the latest version of
tailscale.
Updates tailscale/corp#24066
Updates tailscale/corp#22075
Co-Authored-By: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Ensure that the ExternalName Service port names are always synced to the
ClusterIP Service, to fix a bug where if users created a Service with
a single unnamed port and later changed to 1+ named ports, the operator
attempted to apply an invalid multi-port Service with an unnamed port.
Also, fixes a small internal issue where not-yet Service status conditons
were lost on a spec update.
Updates tailscale/tailscale#10102
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
this commit reduced the amount of data sent in the metrics
data integration test from 10MB to 1MB.
On various machines 10MB was quite flaky, while 1MB has not failed
once on 10000 runs.
Updates #13420
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Re-use a pre-allocated bytes.Buffer struct and
shallow the copy the result of bytes.NewBuffer into it
to avoid allocating the struct.
Note that we're only reusing the bytes.Buffer struct itself
and not the underling []byte temporarily stored within it.
Updates #cleanup
Updates tailscale/corp#18514
Updates golang/go#67004
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
In https://github.com/tailscale/tailscale/pull/13726 we added logic to
`checkExitNodePrefsLocked` to error out on platforms where using an
exit node is unsupported in order to give users more obvious feedback
than having this silently fail downstream.
The above change neglected to properly check whether the device in
question was actually trying to use an exit node when doing the check
and was incorrectly returning an error on any calls to
`checkExitNodePrefsLocked` on platforms where using an exit node is not
supported as a result.
This change remedies this by adding a check to see whether the device is
attempting to use an exit node before doing the `CanUseExitNode` check.
Updates https://github.com/tailscale/corp/issues/24835
Signed-off-by: Mario Minardi <mario@tailscale.com>
I was hoping we'd catch an example input quickly, but the reporter had
rebooted their machine and it is no longer exhibiting the behavior. As
such this code may be sticking around quite a bit longer and we might
encounter other errors, so include the panic in the log entry.
Updates #14201
Updates #14202
Updates golang/go#70528
Signed-off-by: James Tucker <james@tailscale.com>
We add a policy definition for the AllowedSuggestedExitNodes syspolicy setting, allowing admins
to configure a list of exit node IDs to be used as a pool for automatic suggested exit node selection.
We update definitions for policy settings configurable on both a per-user and per-machine basis,
such as UI customizations, to specify class="Both".
Lastly, we update the help text for existing policy definitions to include a link to the KB article
as the last line instead of in the first paragraph.
Updates #12687
Updates tailscale/corp#19681
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we update LocalBackend to rebuild the set of allowed suggested exit nodes whenever
the AllowedSuggestedExitNodes syspolicy setting changes. Additionally, we request a new suggested
exit node when this occurs, enabling its use if the ExitNodeID syspolicy setting is set to auto:any.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This PR removes the sync.Once wrapper around retrieving the MachineCertificateSubject policy
setting value, ensuring the most recent version is always used if it changes after the service starts.
Although this policy setting is used by a very limited number of customers, recent support escalations have highlighted issues caused by outdated or incorrect policy values being applied.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we update ipnlocal.NewLocalBackend to subscribe to policy change notifications
and reapply syspolicy settings to the current profile's ipn.Prefs whenever a change occurs.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This moves code that handles ExitNodeID/ExitNodeIP syspolicy settings
from (*LocalBackend).setExitNodeID to applySysPolicy.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This updates the syspolicy.LogSCMInteractions check to run at the time of an interaction,
just before logging a message, instead of during service startup. This ensures the most
recent policy setting is used if it has changed since the service started.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we move the syspolicy.FlushDNSOnSessionUnlock check from service startup
to when a session change notification is received. This ensures that the most recent policy
setting value is used if it has changed since the service started.
We also plan to handle session change notifications for unrelated reasons
and need to decouple notification subscriptions from DNS anyway.
Updates #12687
Updates tailscale/corp#18342
Signed-off-by: Nick Khyl <nickk@tailscale.com>
These delays determine how soon syspolicy change callbacks are invoked after a policy setting is updated
in a policy source. For tests, we shorten these delays to minimize unnecessary wait times. This adjustment
only affects tests that subscribe to policy change notifications and modify policy settings after they have
already been set. Initial policy settings are always available immediately without delay.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We have several places where LocalBackend instances are created for testing, but they are rarely shut down
when the tests that created them exit.
In this PR, we update newTestLocalBackend and similar functions to use testing.TB.Cleanup(lb.Shutdown)
to ensure LocalBackend instances are properly shut down during test cleanup.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
containerboot:
Adds 3 new environment variables for containerboot, `TS_LOCAL_ADDR_PORT` (default
`"${POD_IP}:9002"`), `TS_METRICS_ENABLED` (default `false`), and `TS_DEBUG_ADDR_PORT`
(default `""`), to configure metrics and debug endpoints. In a follow-up PR, the
health check endpoint will be updated to use the `TS_LOCAL_ADDR_PORT` if
`TS_HEALTHCHECK_ADDR_PORT` hasn't been set.
Users previously only had access to internal debug metrics (which are unstable
and not recommended) via passing the `--debug` flag to tailscaled, but can now
set `TS_METRICS_ENABLED=true` to expose the stable metrics documented at
https://tailscale.com/kb/1482/client-metrics at `/metrics` on the addr/port
specified by `TS_LOCAL_ADDR_PORT`.
Users can also now configure a debug endpoint more directly via the
`TS_DEBUG_ADDR_PORT` environment variable. This is not recommended for production
use, but exposes an internal set of debug metrics and pprof endpoints.
operator:
The `ProxyClass` CRD's `.spec.metrics.enable` field now enables serving the
stable user metrics documented at https://tailscale.com/kb/1482/client-metrics
at `/metrics` on the same "metrics" container port that debug metrics were
previously served on. To smooth the transition for anyone relying on the way the
operator previously consumed this field, we also _temporarily_ serve tailscaled's
internal debug metrics on the same `/debug/metrics` path as before, until 1.82.0
when debug metrics will be turned off by default even if `.spec.metrics.enable`
is set. At that point, anyone who wishes to continue using the internal debug
metrics (not recommended) will need to set the new `ProxyClass` field
`.spec.statefulSet.pod.tailscaleContainer.debug.enable`.
Users who wish to opt out of the transitional behaviour, where enabling
`.spec.metrics.enable` also enables debug metrics, can set
`.spec.statefulSet.pod.tailscaleContainer.debug.enable` to false (recommended).
Separately but related, the operator will no longer specify a host port for the
"metrics" container port definition. This caused scheduling conflicts when k8s
needs to schedule more than one proxy per node, and was not necessary for allowing
the pod's port to be exposed to prometheus scrapers.
Updates #11292
---------
Co-authored-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
A small follow-up to #14112- ensures that the operator itself can emit
Events for its kube state store changes.
Updates tailscale/tailscale#14080
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Otherwise we'll see a panic if we hit the dnsfallback code and try to
call NewDialer with a nil NetMon.
Updates #14161
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I81c6e72376599b341cb58c37134c2a948b97cf5f
So we can locate them in logs more easily.
Updates tailscale/corp#24721
Change-Id: Ia766c75608050dde7edc99835979a6e9bb328df2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Extracts tsaddr.IsTailscaleIPv4 out of tsaddr.IsTailscaleIP.
This will allow for checking valid Tailscale assigned IPv4 addresses
without checking IPv6 addresses.
Updates #14168
Updates tailscale/corp#24620
Signed-off-by: James Scott <jim@tailscale.com>
This is a follow-up to #14112 where our internal kube client was updated
to allow it to emit Events - this updates our sample kube manifests
and tsrecorder manifest templates so they can benefit from this functionality.
Updates tailscale/tailscale#14080
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Initial support for SrcCaps was added in 5ec01bf but it was not actually
working without this.
Updates #12542
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Adds functionality to kube client to emit Events.
Updates kube store to emit Events when tailscaled state has been loaded, updated or if any errors where
encountered during those operations.
This should help in cases where an error related to state loading/updating caused the Pod to crash in a loop-
unlike logs of the originally failed container instance, Events associated with the Pod will still be
accessible even after N restarts.
Updates tailscale/tailscale#14080
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
I merged 5cae7c51bf (removing Notify.BackendLogID) and 93db503565
(adding another reference to Notify.BackendLogID) that didn't have merge
conflicts, but didn't compile together.
This removes the new reference, fixing the build.
Updates #14129
Change-Id: I9bb68efd977342ea8822e525d656817235039a66
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Limit spamming GUIs with boring updates to once in 3 seconds, unless
the notification is relatively interesting and the GUI should update
immediately.
This is basically @barnstar's #14119 but with the logic moved to be
per-watch-session (since the bit is per session), rather than
globally. And this distinguishes notable Notify messages (such as
state changes) and makes them send immediately.
Updates tailscale/corp#24553
Change-Id: I79cac52cce85280ce351e65e76ea11e107b00b49
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The v2 endpoint supports HTTP/2 bidirectional streaming and acks for
received bytes. This is used to detect when a recorder disappears to
more quickly terminate the session.
Updates https://github.com/tailscale/corp/issues/24023
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
* ipn,tailcfg: add VIPService struct and c2n to fetch them from client
Updates tailscale/corp#22743, tailscale/corp#22955
Signed-off-by: Naman Sood <mail@nsood.in>
* more review fixes
Signed-off-by: Naman Sood <mail@nsood.in>
* don't mention PeerCapabilityServicesDestination since it's currently unused
Signed-off-by: Naman Sood <mail@nsood.in>
---------
Signed-off-by: Naman Sood <mail@nsood.in>
Back in the day this testcontrol package only spoke the
nacl-boxed-based control protocol, which used this.
Then we added ts2021, which didn't, but still sometimes used it.
Then we removed the old mode and didn't remove this parameter
in 2409661a0d.
Updates #11585
Change-Id: Ifd290bd7dbbb52b681b3599786437a15bc98b6a5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously we required the program to be running in a test or have
TS_CONTROL_IS_PLAINTEXT_HTTP before we disabled its https fallback
on "http" schema control URLs to localhost with ports.
But nobody accidentally does all three of "http", explicit port
number, localhost and doesn't mean it. And when they mean it, they're
testing a localhost dev control server (like I was) and don't want 443
getting involved.
As of the changes for #13597, this became more annoying in that we
were trying to use a port which wasn't even available.
Updates #13597
Change-Id: Icd00bca56043d2da58ab31de7aa05a3b269c490f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We currently annotate pods with a hash of the tailscaled config so that
we can trigger pod restarts whenever it changes. However, the hash
updates more frequently than is necessary causing more restarts than is
necessary. This commit removes two causes; scaling up/down and removing
the auth key after pods have initially authed to control. However, note
that pods will still restart on scale-up/down because of the updated set
of volumes mounted into each pod. Hopefully we can fix that in a planned
follow-up PR.
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This gets close to all of the remaining ones.
Updates #12912
Change-Id: I9c672bbed2654a6c5cab31e0cbece6c107d8c6fa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It doesn't need a Clone method, like a time.Time, etc.
And then, because Go 1.23+ uses unique.Handle internally for
the netip package types, we can remove those special cases.
Updates #14058 (pulled out from that PR)
Updates tailscale/corp#24485
Change-Id: Iac3548a9417ccda5987f98e0305745a6e178b375
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Perhaps I was too opimistic in #13323 thinking we won't need logs for
this. Let's log a summary of the response without logging specific
identifiers.
Updates tailscale/corp#24437
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Or unless the new "ts_debug_websockets" build tag is set.
Updates #1278
Change-Id: Ic4c4f81c1924250efd025b055585faec37a5491d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Otherwise all the clients only using control/controlhttp for the
ts2021 HTTP client were also pulling in WebSocket libraries, as the
server side always needs to speak websockets, but only GOOS=js clients
speak it.
This doesn't yet totally remove the websocket dependency on Linux because
Linux has a envknob opt-in to act like GOOS=js for manual testing and force
the use of WebSockets for DERP only (not control). We can put that behind
a build tag in a future change to eliminate the dep on all GOOSes.
Updates #1278
Change-Id: I4f60508f4cad52bf8c8943c8851ecee506b7ebc9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Some environments would like to remove Tailscale SSH support for the
binary for various reasons when not needed (either for peace of mind,
or the ~1MB of binary space savings).
Updates tailscale/corp#24454
Updates #1278
Updates #12614
Change-Id: Iadd6c5a393992c254b5dc9aa9a526916f96fd07a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Adds a /disconnect-control local API endpoint that just shuts down control client.
This can be run before shutting down an HA subnet router/app connector replica - it will ensure
that all connection to control are dropped and control thus considers this node inactive and tells
peers to switch over to another replica. Meanwhile the existing connections keep working (assuming
that the replica is given some graceful shutdown period).
Updates tailscale/tailscale#14020
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Sets a custom hostinfo app type for ProxyGroup replicas, similarly
to how we do it for all other Kubernetes Operator managed components.
Updates tailscale/tailscale#13406,tailscale/corp#22920
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
- Basic description of DERP
If configured to do so, also show
- Mailto link to security@tailscale.com
- Link to Tailscale Security Policies
- Link to Tailscale Acceptable Use Policy
Updates tailscale/corp#24092
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This adds a new generic result type (motivated by golang/go#70084) to
try it out, and uses it in the new lineutil package (replacing the old
lineread package), changing that package to return iterators:
sometimes over []byte (when the input is all in memory), but sometimes
iterators over results of []byte, if errors might happen at runtime.
Updates #12912
Updates golang/go#70084
Change-Id: Iacdc1070e661b5fb163907b1e8b07ac7d51d3f83
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Thanks to @davidbuzz for raising the issue in #13973.
Fixes#8272Fixes#13973
Change-Id: Ic413e14d34c82df3c70a97e591b90316b0b4946b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Key changes:
- No mutex for every udp package: replace syncs.Map with regular map for udpTargetConns
- Use socksAddr as map key for better type safety
- Add test for multi udp target
Updates #7581
Change-Id: Ic3d384a9eab62dcbf267d7d6d268bf242cc8ed3c
Signed-off-by: VimT <me@vimt.me>
This commit addresses an issue with the SOCKS5 UDP relay functionality
when using the --tun=userspace-networking option. Previously, UDP packets
were not being correctly routed into the Tailscale network in this mode.
Key changes:
- Replace single UDP connection with a map of connections per target
- Use c.srv.dial for creating connections to ensure proper routing
Updates #7581
Change-Id: Iaaa66f9de6a3713218014cf3f498003a7cac9832
Signed-off-by: VimT <me@vimt.me>
A filesystem was plumbed into netstack in 993acf4475
but hasn't been used since 2d5d6f5403. Remove it.
Noticed while rebasing a Tailscale fork elsewhere.
Updates tailscale/corp#16827
Change-Id: Ib76deeda205ffe912b77a59b9d22853ebff42813
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We were only updating the ProfileManager and not going down
the EditPrefs path which meant the prefs weren't applied
till either the process restarted or some other pref changed.
This makes it so that we reconfigure everything correctly when
ReloadConfig is called.
Updates #13032
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Add an explicit case for exercising preferred DERP hysteresis around
the branch that compares latencies on a percentage basis.
Updates #cleanup
Signed-off-by: Jordan Whited <jordan@tailscale.com>
By counting "/" elements in the pattern we catch many scenarios, but not
the root-level handler. If either of the patterns is "/", compare the
pattern length to pick the right one.
Updates https://github.com/tailscale/corp/issues/8027
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
In this PR, we add the tailscale syspolicy command with two subcommands: list, which displays
policy settings, and reload, which forces a reload of those settings. We also update the LocalAPI
and LocalClient to facilitate these additions.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Make it possible to advertise app connector via a new conffile field.
Also bumps capver - conffile deserialization errors out if unknonw
fields are set, so we need to know which clients understand the new field.
Updates tailscale/tailscale#11113
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This required sharing the dropped packet metric between two packages
(tstun and magicsock), so I've moved its definition to util/usermetric.
Updates tailscale/corp#22075
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
The user-facing metrics are intended to track data transmitted at
the overlay network level.
Updates tailscale/corp#22075
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
In an environment with unstable latency, such as upstream bufferbloat,
there are cases where a full netcheck could drop the prior preferred
DERP (likely home DERP) from future netcheck probe plans. This will then
likely result in a home DERP having a missing sample on the next
incremental netcheck, ultimately resulting in a home DERP move.
This change does not fix our overall response to highly unstable
latency, but it is an incremental improvement to prevent single spurious
samples during a full netcheck from alone triggering a flapping
condition, as now the prior changes to include historical latency will
still provide the desired resistance, and the home DERP should not move
unless latency is consistently worse over a 5 minute period.
Note that there is a nomenclature and semantics issue remaining in the
difference between a report preferred DERP and a home DERP. A report
preferred DERP is aspirational, it is what will be picked as a home DERP
if a home DERP connection needs to be established. A nodes home DERP may
be different than a recent preferred DERP, in which case a lot of
netcheck logic is fallible. In future enhancements much of the DERP move
logic should move to consider the home DERP, rather than recent report
preferred DERP.
Updates #8603
Updates #13969
Signed-off-by: James Tucker <james@tailscale.com>
We make setting.Snapshot JSON-marshallable in preparation for returning it from the LocalAPI.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We add setting.RawValue, a new type that facilitates unmarshalling JSON numbers and arrays
as uint64 and []string (instead of float64 and []any) for policy setting values.
We then use it to make setting.RawItem JSON-marshallable and update the tests.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we implement (but do not use yet, pending #13727 review) a syspolicy/source.Store
that reads policy settings from environment variables. It converts a CamelCase setting.Key,
such as AuthKey or ExitNodeID, to a SCREAMING_SNAKE_CASE, TS_-prefixed environment
variable name, such as TS_AUTH_KEY and TS_EXIT_NODE_ID. It then looks up the variable
and attempts to parse it according to the expected value type. If the environment variable
is not set, the policy setting is considered not configured in this store (the syspolicy package
will still read it from other sources). Similarly, if the environment variable has an invalid value
for the setting type, it won't be used (though the reported/logged error will differ).
Updates #13193
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Now when we have HA for egress proxies, it makes sense to support topology
spread constraints that would allow users to define more complex
topologies of how proxy Pods need to be deployed in relation with other
Pods/across regions etc.
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This adds additional logging on DERP home changes to allow
better troubleshooting.
Updates tailscale/corp#18095
Signed-off-by: Tim Walters <tim@tailscale.com>
updates tailscale/corp#24197
tailmac run now supports the --share option which will allow you
to specify a directory on the host which can be mounted in the guest
using mount_virtiofs vmshare <path>.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
updates tailscale/corp#24197
Generation of the Host.app path was erroneous and tailmac run
would not work unless the pwd was tailmac/bin. Now you can
be able to invoke tailmac from anywhere.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
- `tailscale metrics print`: to show metric values in console
- `tailscale metrics write`: to write metrics to a file (with a tempfile
& rename dance, which is atomic on Unix).
Also, remove the `TS_DEBUG_USER_METRICS` envknob as we are getting
more confident in these metrics.
Updates tailscale/corp#22075
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Not confident this is the right way to expose this, so let's remote it
for now.
Updates tailscale/corp#22075
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
If the client cannot fetch a serial number, write a log message helping
the user understand what happened. Also, don't just return the error
immediately, since we still have a chance to collect network interface
addresses.
Updates #5902
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
trimpath can be inconvenient for IDEs and LSPs that do not always
correctly handle module relative paths, and can also contribute to
caching bugs taking effect. We rarely have a real need for trimpath of
test produced binaries, so avoiding it should be a net win.
Updates #2988
Signed-off-by: James Tucker <james@tailscale.com>
A simple implementation of latency and loss simulation, applied to
writes to the ethernet interface of the NIC. The latency implementation
could be optimized substantially later if necessary.
Updates #13355
Signed-off-by: James Tucker <james@tailscale.com>
During resolv.conf update, old 'search' lines are cleared but '\n' is not
deleted, leaving behind a new blank line on every update.
This adds 's' flag to regexp, so '\n' is included in the match and deleted when
old lines are cleared.
Also, insert missing `\n` when updated 'search' line is appended to resolv.conf.
Signed-off-by: Renato Aguiar <renato@renatoaguiar.net>
Cache state in memory on writes, read from memory
in reads.
kubestore was previously always reading state from a Secret.
This change should fix bugs caused by temporary loss of access
to kube API server and imporove overall performance
Fixes#7671
Updates tailscale/tailscale#12079,tailscale/tailscale#13900
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
In this PR, we update the syspolicy package to utilize syspolicy/rsop under the hood,
and remove syspolicy.CachingHandler, syspolicy.windowsHandler and related code
which is no longer used.
We mark the syspolicy.Handler interface and RegisterHandler/SetHandlerForTest functions
as deprecated, but keep them temporarily until they are no longer used in other repos.
We also update the package to register setting definitions for all existing policy settings
and to register the Registry-based, Windows-specific policy stores when running on Windows.
Finally, we update existing internal and external tests to use the new API and add a few more
tests and benchmarks.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This allows us to print the time that a netcheck was run, which is
useful in debugging.
Updates #10972
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id48d30d4eb6d5208efb2b1526a71d83fe7f9320b
Few changes to resolve TODOs in the code:
- Instead of using a hardcoded IP, get it from the netmap.
- Use 100.100.100.100 as the gateway IP
- Use the /10 CGNAT range instead of a random /24
Updates #2589
Signed-off-by: Maisem Ali <maisem@tailscale.com>
It had bit-rotted likely during the transition to vector io in
76389d8baf. Tested on Ubuntu 24.04
by creating a netns and doing the DHCP dance to get an IP.
Updates #2589
Signed-off-by: Maisem Ali <maisem@tailscale.com>
In f77821fd63 (released in v1.72.0), we made the client tell a DERP server
when the connection was not its ideal choice (the first node in its region).
But we didn't do anything with that information until now. This adds a
metric about how many such connections are on a given derper, and also
adds a bit to the PeerPresentFlags bitmask so watchers can identify
(and rebalance) them.
Updates tailscale/corp#372
Change-Id: Ief8af448750aa6d598e5939a57c062f4e55962be
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates tailscale/tailscale#13839
Adds a new blockblame package which can detect common MITM SSL certificates used by network appliances. We use this in `tlsdial` to display a dedicated health warning when we cannot connect to control, and a network appliance MITM attack is detected.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Clamp the min and max version for DSM 7.0 and DSM 7.2 packages when we
are building packages for the synology package centre. This change
leaves packages destined for pkgs.tailscale.com with just the min
version set to not break packages in the wild / our update flow.
Updates https://github.com/tailscale/corp/issues/22908
Signed-off-by: Mario Minardi <mario@tailscale.com>
GetReport() may have side effects when the caller enforces a deadline
that is shorter than ReportTimeout.
Updates #13783
Updates #13394
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We add the ClientID() method to the ipnauth.Actor interface and updated ipnserver.actor to implement it.
This method returns a unique ID of the connected client if the actor represents one. It helps link a series
of interactions initiated by the client, such as when a notification needs to be sent back to a specific session,
rather than all active sessions, in response to a certain request.
We also add LocalBackend.WatchNotificationsAs and LocalBackend.StartLoginInteractiveAs methods,
which are like WatchNotifications and StartLoginInteractive but accept an additional parameter
specifying an ipnauth.Actor who initiates the operation. We store these actor identities in
watchSession.owner and LocalBackend.authActor, respectively,and implement LocalBackend.sendTo
and related helper methods to enable sending notifications to watchSessions associated with actors
(or, more broadly, identifiable recipients).
We then use the above to change who receives the BrowseToURL notifications:
- For user-initiated, interactive logins, the notification is delivered only to the user who initiated the
process. If the initiating actor represents a specific connected client, the URL notification is sent back
to the same LocalAPI client that called StartLoginInteractive. Otherwise, the notification is sent to all
clients connected as that user.
Currently, we only differentiate between users on Windows, as it is inherently a multi-user OS.
- In all other cases (e.g., node key expiration), we send the notification to all connected users.
Updates tailscale/corp#18342
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Write timeouts can be indicative of stalled TCP streams. Understanding
changes in the rate of such events can be helpful in an ops context.
Updates tailscale/corp#23668
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This fixes the installation on newer Fedora versions that use dnf5 as
the 'dnf' binary.
Updates #13828
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I39513243c81640fab244a32b7dbb3f32071e9fce
Adds logic to `checkExitNodePrefsLocked` to return an error when
attempting to use exit nodes on a platform where this is not supported.
This mirrors logic that was added to error out when trying to use `ssh`
on an unsupported platform, and has very similar semantics.
Fixes https://github.com/tailscale/tailscale/issues/13724
Signed-off-by: Mario Minardi <mario@tailscale.com>
While looking at deflaking TestTwoDevicePing/ping_1.0.0.2_via_SendPacket,
there were a bunch of distracting:
WARNING: (non-fatal) nil health.Tracker (being strict in CI): ...
This pacifies those so it's easier to work on actually deflaking the test.
Updates #11762
Updates #11874
Change-Id: I08dcb44511d4996b68d5f1ce5a2619b555a2a773
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* updates to LocalBackend require metrics to be passed in which are now initialized
* os.MkdirTemp isn't supported in wasm/js so we simply return empty
string for logger
* adds a UDP dialer which was missing and led to the dialer being
incompletely initialized
Fixes#10454 and #8272
Signed-off-by: Christian <christian@devzero.io>
In this PR we add syspolicy/rsop package that facilitates policy source registration
and provides access to the resultant policy merged from all registered sources for a
given scope.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
For a customer that wants to run their own DERP prober, let's add a
/healthz endpoint that can be used to monitor derpprobe itself.
Updates #6526
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Iba315c999fc0b1a93d8c503c07cc733b4c8d5b6b
Our existing container-detection tricks did not work on Kubernetes,
where Docker is no longer used as a container runtime. Extends the
existing go build tags for containers to the other container packages
and uses that to reliably detect builds that were created by Tailscale
for use in a container. Unfortunately this doesn't necessarily improve
detection for users' custom builds, but that's a separate issue.
Updates #13825
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
connstats currently increments the packet counter whenever it is called
to store a length of data, however when udp batch sending was introduced
we pass the length for a series of packages, and it is only incremented
ones, making it count wrongly if we are on a platform supporting udp
batches.
Updates tailscale/corp#22075
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
This allows passing through any environment variables that we set ourselves, for example DBUS_SESSION_BUS_ADDRESS.
Updates #11175
Co-authored-by: Mario Minardi <mario@tailscale.com>
Signed-off-by: Percy Wegmann <percy@tailscale.com>
The bools.Compare function compares boolean values
by reporting -1, 0, +1 for ordering so that it can be easily
used with slices.SortFunc.
Updates #cleanup
Updates tailscale/corp#11038
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
If multiple upstream DNS servers are available, quad-100 sends requests to all of them
and forwards the first successful response, if any. If no successful responses are received,
it propagates the first failure from any of them.
This PR adds some test coverage for these scenarios.
Updates #13571
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We currently have two executions paths where (*forwarder).forwardWithDestChan
returns nil, rather than an error, without sending a DNS response to responseChan.
These paths are accompanied by a comment that reads:
// Returning an error will cause an internal retry, there is
// nothing we can do if parsing failed. Just drop the packet.
But it is not (or no longer longer) accurate: returning an error from forwardWithDestChan
does not currently cause a retry.
Moreover, although these paths are currently unreachable due to implementation details,
if (*forwarder).forwardWithDestChan were to return nil without sending a response to
responseChan, it would cause a deadlock at one call site and a panic at another.
Therefore, we update (*forwarder).forwardWithDestChan to return errors in those two paths
and remove comments that were no longer accurate and misleading.
Updates #cleanup
Updates #13571
Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
If a DoH server returns an HTTP server error, rather than a SERVFAIL within
a successful HTTP response, we should handle it in the same way as SERVFAIL.
Updates #13571
Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
As per the docstring, (*forwarder).forwardWithDestChan should either send to responseChan
and returns nil, or returns a non-nil error (without sending to the channel).
However, this does not hold when all upstream DNS servers replied with an error.
We've been handling this special error path in (*Resolver).Query but not in (*Resolver).HandlePeerDNSQuery.
As a result, SERVFAIL responses from upstream servers were being converted into HTTP 503 responses,
instead of being properly forwarded as SERVFAIL within a successful HTTP response, as per RFC 8484, section 4.2.1:
A successful HTTP response with a 2xx status code (see Section 6.3 of [RFC7231]) is used for any valid DNS response,
regardless of the DNS response code. For example, a successful 2xx HTTP status code is used even with a DNS message
whose DNS response code indicates failure, such as SERVFAIL or NXDOMAIN.
In this PR we fix (*forwarder).forwardWithDestChan to no longer return an error when it sends a response to responseChan,
and remove the special handling in (*Resolver).Query, as it is no longer necessary.
Updates #13571
Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
This helps better distinguish what is generating activity to the
Tailscale public API.
Updates tailscale/corp#23838
Signed-off-by: Percy Wegmann <percy@tailscale.com>
We were using google/uuid in two places and that brought in database/sql/driver.
We didn't need it in either place.
Updates #13760
Updates tailscale/corp#20099
Change-Id: Ieed32f1bebe35d35f47ec5a2a429268f24f11f1f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There's never a tailscaled on iOS. And we can't run child processes to
look for it anyway.
Updates tailscale/corp#20099
Change-Id: Ieb3776f4bb440c4f1c442fdd169bacbe17f23ddb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We probably shouldn't link it in anywhere, but let's fix iOS for now.
Updates #13762
Updates tailscale/corp#20099
Change-Id: Idac116e9340434334c256acba3866f02bd19827c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
One primary purpose of WithLock is to mutate the underlying map.
However, this can lead to a panic if it happens to be nil.
Thus, always allocate a map before passing it to f.
Updates tailscale/corp#11038
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Thus new function allows constructing vizerrors that combine a message
appropriate for display to users with a wrapped underlying error.
Updates tailscale/corp#23781
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Add Keys, Values, and All to iterate over
all keys, values, and entries, respectively.
Updates #11038
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
cmd/k8s-operator,k8s-operator/apis: set a readiness condition on egress Services
Set a readiness condition on ExternalName Services that define a tailnet target
to route cluster traffic to via a ProxyGroup's proxies. The condition
is set to true if at least one proxy is currently set up to route.
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Their callers using Range are all kinda clunky feeling. Iterators
should make them more readable.
Updates #12912
Change-Id: I93461eba8e735276fda4a8558a4ae4bfd6c04922
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We don't need to error out and continuously reconcile if ProxyClass
has not (yet) been created, once it gets created the ProxyGroup
reconciler will get triggered.
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
As discussed in #13684, base the ProxyGroup's proxy definitions on the same
scaffolding as the existing proxies, as defined in proxy.yaml
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Instead of converting our PortMap struct to a string during marshalling
for use as a key, convert the whole collection of PortMaps to a list of
PortMap objects, which improves the readability of the JSON config while
still keeping the data structure we need in the code.
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Currently egress Services for ProxyGroup only work for Pods and Services
with IPv4 addresses. Ensure that it works on dual stack clusters by reading
proxy Pod's IP from the .status.podIPs list that always contains both
IPv4 and IPv6 address (if the Pod has them) rather than .status.podIP that
could contain IPv6 only for a dual stack cluster.
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The default ProxyClass can be set via helm chart or env var, and applies
to all proxies that do not otherwise have an explicit ProxyClass set.
This ensures proxies created by the new ProxyGroup CRD are consistent
with the behaviour of existing proxies
Nearby but unrelated changes:
* Fix up double error logs (controller runtime logs returned errors)
* Fix a couple of variable names
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Rearrange conditionals to reduce indentation and make it a bit easier to read
the logic. Also makes some error message updates for better consistency
with the recent decision around capitalising resource names and the
upcoming addition of config secrets.
Updates #cleanup
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
It is sometimes necessary to defer initialization steps until the first actual usage
or until certain prerequisites have been met. For example, policy setting and
policy source registration should not occur during package initialization.
Instead, they should be deferred until the syspolicy package is actually used.
Additionally, any errors should be properly handled and reported, rather than
causing a panic within the package's init function.
In this PR, we add DeferredInit, to facilitate the registration and invocation
of deferred initialization functions.
Updates #12687
Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
To avoid warning:
find: warning: you have specified the global option -maxdepth after the argument -type, but global options are not positional, i.e., -maxdepth affects tests specified before it as well as those specified after it. Please specify global options before other arguments.
Fixestailscale/corp#23689
Change-Id: I91ee260b295c552c0a029883d5e406733e081478
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Implements the controller for the new ProxyGroup CRD, designed for
running proxies in a high availability configuration. Each proxy gets
its own config and state Secret, and its own tailscale node ID.
We are currently mounting all of the config secrets into the container,
but will stop mounting them and instead read them directly from the kube
API once #13578 is implemented.
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Like we do for the ones on iOS.
As a bonus, this removes a caller of tsaddr.IsTailscaleIP which we
want to revamp/remove soonish.
Updates #13687
Change-Id: Iab576a0c48e9005c7844ab52a0aba5ba343b750e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Per my investigation just now, the $HOME environment variable is unset
on the macsys (standalone macOS GUI) variant, but the current working
directory is valid. Look for the environment variable file in that
location in addition to inside the home directory.
Updates #3707
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I481ae2e0d19b316244373e06865e3b5c3a9f3b88
Extend safeweb.Config with the ability to pass a http.Server that
safeweb will use to server traffic.
Updates corp#8207
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Adds a new reconciler that reconciles ExternalName Services that define a
tailnet target that should be exposed to cluster workloads on a ProxyGroup's
proxies.
The reconciler ensures that for each such service, the config mounted to
the proxies is updated with the tailnet target definition and that
and EndpointSlice and ClusterIP Service are created for the service.
Adds a new reconciler that ensures that as proxy Pods become ready to route
traffic to a tailnet target, the EndpointSlice for the target is updated
with the Pods' endpoints.
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The AddSNATRuleForDst rule was adding a new rule each time it was called including:
- if a rule already existed
- if a rule matching the destination, but with different desired source already existed
This was causing issues especially for the in-progress egress HA proxies work,
where the rules are now refreshed more frequently, so more redundant rules
were being created.
This change:
- only creates the rule if it doesn't already exist
- if a rule for the same dst, but different source is found, delete it
- also ensures that egress proxies refresh firewall rules
if the node's tailnet IP changes
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Updates tailscale/tailscale#3363
We know `log.tailscale.io` supports TLS 1.3, so we can enforce its usage in the client to shake some bytes off the TLS handshake each time a connection is opened to upload logs.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
The new logging in 2dd71e64ac is spammy at shutdown:
Receive func ReceiveIPv6 exiting with error: *net.OpError, read udp [::]:38869: raw-read udp6 [::]:38869: use of closed network connection
Receive func ReceiveIPv4 exiting with error: *net.OpError, read udp 0.0.0.0:36123: raw-read udp4 0.0.0.0:36123: use of closed network connection
Skip it if we're in the process of shutting down.
Updates #10976
Change-Id: I4f6d1c68465557eb9ffe335d43d740e499ba9786
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We were selectively uploading it, but we were still gathering it,
which can be a waste of CPU.
Also remove a bunch of complexity that I don't think matters anymore.
And add an envknob to force service collection off on a single node,
even if the tailnet policy permits it.
Fixes#13463
Change-Id: Ib6abe9e29d92df4ffa955225289f045eeeb279cf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When an Exit Node is used, we create a WFP rule to block all inbound and outbound traffic,
along with several rules to permit specific types of traffic. Notably, we allow all inbound and
outbound traffic to and from LocalRoutes specified in wgengine/router.Config. The list of allowed
routes always includes routes for internal interfaces, such as loopback and virtual Hyper-V/WSL2
interfaces, and may also include LAN routes if the "Allow local network access" option is enabled.
However, these permitting rules do not allow link-local multicast on the corresponding interfaces.
This results in broken mDNS/LLMNR, and potentially other similar issues, whenever an exit node is used.
In this PR, we update (*wf.Firewall).UpdatePermittedRoutes() to create rules allowing outbound and
inbound link-local multicast traffic to and from the permitted IP ranges, partially resolving the mDNS/LLMNR
and *.local name resolution issue.
Since Windows does not attempt to send mDNS/LLMNR queries if a catch-all NRPT rule is present,
it is still necessary to disable the creation of that rule using the disable-local-dns-override-via-nrpt nodeAttr.
Updates #13571
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Of tests we wish we could easily add. One day.
Updates #13038
Change-Id: If44646f8d477674bbf2c9a6e58c3cd8f94a4e8df
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates tailscale/tailscale#6148
This is the result of some observations we made today with @raggi. The DNS over HTTPS client currently doesn't cap the number of connections it uses, either in-use or idle. A burst of DNS queries will open multiple connections. Idle connections remain open for 30 seconds (this interval is defined in the dohTransportTimeout constant). For DoH providers like NextDNS which send keep-alives, this means the cellular modem will remain up more than expected to send ACKs if any keep-alives are received while a connection remains idle during those 30 seconds. We can set the IdleConnTimeout to 10 seconds to ensure an idle connection is terminated if no other DNS queries come in after 10 seconds. Additionally, we can cap the number of connections to 1. This ensures that at all times there is only one open DoH connection, either active or idle. If idle, it will be terminated within 10 seconds from the last query.
We also observed all the DoH providers we support are capable of TLS 1.3. We can force this TLS version to reduce the number of packets sent/received each time a TLS connection is established.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1eaad7d3de regressed some tests in another repo that were starting up
a control server on `http://127.0.0.1:nnn`. Because there was no https
running, and because of a bug in 1eaad7d3de (which ended up checking
the recently-dialed-control check twice in a single dial call), we
ended up forcing only the use of TLS dials in a test that only had
plaintext HTTP running.
Instead, plumb down support for explicitly disabling TLS fallbacks and
use it only when running in a test and using `http` scheme control
plane URLs to 127.0.0.1 or localhost.
This fixes the tests elsewhere.
Updates #13597
Change-Id: I97212ded21daf0bd510891a278078daec3eebaa6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I noticed while debugging a test failure elsewhere that our failure
logs (when verbosity is cranked up) were uselessly attributing dial
failures to failure to dial an invalid IP address (this IPv6 address
we didn't have), rather than showing me the actual IPv4 connection
failure.
Updates #13597 (tangentially)
Change-Id: I45ffbefbc7e25ebfb15768006413a705b941dae5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates tailscale/tailscale#1634
Updates tailscale/tailscale#13265
Captive portal detection uses a custom `net.Dialer` in its `http.Client`. This custom Dialer ensures that the socket is bound specifically to the Wi-Fi interface. This is crucial because without it, if any default routes are set, the outgoing requests for detecting a captive portal would bypass Wi-Fi and go through the default route instead.
The Dialer did not have a Timeout property configured, so the default system timeout was applied. This caused issues in #13265, where we attempted to make captive portal detection requests over an IPsec interface used for Wi-Fi Calling. The call to `connect()` would fail and remain blocked until the system timeout (approximately 1 minute) was reached.
In #13598, I simply excluded the IPsec interface from captive portal detection. This was a quick and safe mitigation for the issue. This PR is a follow-up to make the process more robust, by setting a 3 seconds timeout on any connection establishment on any interface (this is the same timeout interval we were already setting on the HTTP client).
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
We were previously not checking that the external IP that we got back
from a UPnP portmap was a valid endpoint; add minimal validation that
this endpoint is something that is routeable by another host.
Updates tailscale/corp#23538
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id9649e7683394aced326d5348f4caa24d0efd532
This pulls out the clock and forceNoise443 code into methods on the
Dialer as cleanup in its own commit to make a future change less
distracting.
Updates #13597
Change-Id: I7001e57fe7b508605930c5b141a061b6fb908733
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In prep for a future port 80 MITM fix, make the 'debug ts2021' command
retry once after a failure to give it a chance to pick a new strategy.
Updates #13597
Change-Id: Icb7bad60cbf0dbec78097df4a00e9795757bc8e4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* cmd/containerboot,kube,util/linuxfw: configure kube egress proxies to route to 1+ tailnet targets
This commit is first part of the work to allow running multiple
replicas of the Kubernetes operator egress proxies per tailnet service +
to allow exposing multiple tailnet services via each proxy replica.
This expands the existing iptables/nftables-based proxy configuration
mechanism.
A proxy can now be configured to route to one or more tailnet targets
via a (mounted) config file that, for each tailnet target, specifies:
- the target's tailnet IP or FQDN
- mappings of container ports to which cluster workloads will send traffic to
tailnet target ports where the traffic should be forwarded.
Example configfile contents:
{
"some-svc": {"tailnetTarget":{"fqdn":"foo.tailnetxyz.ts.net","ports"{"tcp:4006:80":{"protocol":"tcp","matchPort":4006,"targetPort":80},"tcp:4007:443":{"protocol":"tcp","matchPort":4007,"targetPort":443}}}}
}
A proxy that is configured with this config file will configure firewall rules
to route cluster traffic to the tailnet targets. It will then watch the config file
for updates as well as monitor relevant netmap updates and reconfigure firewall
as needed.
This adds a bunch of new iptables/nftables functionality to make it easier to dynamically update
the firewall rules without needing to restart the proxy Pod as well as to make
it easier to debug/understand the rules:
- for iptables, each portmapping is a DNAT rule with a comment pointing
at the 'service',i.e:
-A PREROUTING ! -i tailscale0 -p tcp -m tcp --dport 4006 -m comment --comment "some-svc:tcp:4006 -> tcp:80" -j DNAT --to-destination 100.64.1.18:80
Additionally there is a SNAT rule for each tailnet target, to mask the source address.
- for nftables, a separate prerouting chain is created for each tailnet target
and all the portmapping rules are placed in that chain. This makes it easier
to look up rules and delete services when no longer needed.
(nftables allows hooking a custom chain to a prerouting hook, so no extra work
is needed to ensure that the rules in the service chains are evaluated).
The next steps will be to get the Kubernetes Operator to generate
the configfile and ensure it is mounted to the relevant proxy nodes.
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The operator creates a non-reusable auth key for each of
the cluster proxies that it creates and puts in the tailscaled
configfile mounted to the proxies.
The proxies are always tagged, and their state is persisted
in a Kubernetes Secret, so their node keys are expected to never
be regenerated, so that they don't need to re-auth.
Some tailnet configurations however have seen issues where the auth
keys being left in the tailscaled configfile cause the proxies
to end up in unauthorized state after a restart at a later point
in time.
Currently, we have not found a way to reproduce this issue,
however this commit removes the auth key from the config once
the proxy can be assumed to have logged in.
If an existing, logged-in proxy is upgraded to this version,
its redundant auth key will be removed from the conffile.
If an existing, logged-in proxy is downgraded from this version
to a previous version, it will work as before without re-issuing key
as the previous code did not enforce that a key must be present.
Updates tailscale/tailscale#13451
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The ProxyGroup CRD specifies a set of N pods which will each be a
tailnet device, and will have M different ingress or egress services
mapped onto them. It is the mechanism for specifying how highly
available proxies need to be. This commit only adds the definition, no
controller loop, and so it is not currently functional.
This commit also splits out TailnetDevice and RecorderTailnetDevice
into separate structs because the URL field is specific to recorders,
but we want a more generic struct for use in the ProxyGroup status field.
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Updates tailscale/tailscale#1634
Logs from some iOS users indicate that we're pointlessly performing captive portal detection on certain interfaces named ipsec*. These are tunnels with the cellular carrier that do not offer Internet access, and are only used to provide internet calling functionality (VoLTE / VoWiFi).
```
attempting to do captive portal detection on interface ipsec1
attempting to do captive portal detection on interface ipsec6
```
This PR excludes interfaces with the `ipsec` prefix from captive portal detection.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Add logic for parsing and matching against our planned format for
AcceptEnv values. Namely, this supports direct matches against string
values and matching where * and ? are treated as wildcard characters
which match against an arbitrary number of characters and a single
character respectively.
Actually using this logic in non-test code will come in subsequent
changes.
Updates https://github.com/tailscale/corp/issues/22775
Signed-off-by: Mario Minardi <mario@tailscale.com>
Like Linux, macOS will reply to sendto(2) with EPERM if the firewall is
currently blocking writes, though this behavior is like Linux
undocumented. This is often caused by a faulting network extension or
content filter from EDR software.
Updates #11710
Updates #12891
Updates #13511
Signed-off-by: James Tucker <james@tailscale.com>
This breaks its ability to be used as an expvar and is blocking a trunkd
deploy. Revert for now, and add a test to ensure that we don't break it
in a future change.
Updates #13550
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1f1221c257c1de47b4bff0597c12f8530736116d
When querying for an exit node suggestion, occasionally it triggers a
new report concurrently with an existing report in progress. Generally,
there should always be a recent report or one in progress, so it is
redundant to start one there, and it causes concurrency issues.
Fixes#12643
Change-Id: I66ab9003972f673e5d4416f40eccd7c6676272a5
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
this commit changes usermetrics to be non-global, this is a building
block for correct metrics if a go process runs multiple tsnets or
in tests.
Updates #13420
Updates tailscale/corp#22075
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
So it doesn't delete and re-pull when switching between branches.
Updates tailscale/corp#17686
Change-Id: Iffb989781db42fcd673c5f03dbd0ce95972ede0f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates tailscale/tailscale#13326
Adds a CLI subcommand to perform DNS queries using the internal DNS forwarder and observe its internals (namely, which upstream resolvers are being used).
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Pin re-actors/alls-green usage to latest 1.x. This was previously
pointing to `@release/v2` which pulls in the latest changes from this
branch as they are released, with the potential to break our workflows
if a breaking change or malicious version on this stream is ever pushed.
Changing this to a pinned version also means that dependabot will keep
this in the pinned version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Update and pin actions/upload-artifact usage to latest 4.x. These were
previously pointing to @3 which pulls in the latest v3 as they are
released, with the potential to break our workflows if a breaking change
or malicious version on the @3 stream is ever pushed.
Changing this to a pinned version also means that dependabot will keep
this in the pinned version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Update and pin actions/cache usage to latest 4.x. These were previously
pointing to `@3` which pulls in the latest v3 as they are released, with
the potential to break our workflows if a breaking change or malicious
version on the `@3` stream is ever pushed.
Changing this to a pinned version also means that dependabot will keep
this in the pinned version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.
The breaking change between v3 and v4 is that v4 requires Node 20 which
should be a non-issue where this is run.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Use slackapi/slack-github-action across the board and pin to latest 1.x.
Previously we were referencing the 1.27.0 tag directly which is
vulnerable to someone replacing that version tag with malicious code.
Replace usage of ruby/action-slack with slackapi/slack-github-action as
the latter is the officially supported action from slack.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Pin codeql actions usage to latest 3.x. These were previously pointing
to `@2` which pulls in the latest v2 as they are released, with the
potential to break our workflows if a breaking change or malicious
version on the `@2` stream is ever pushed.
Changing this to a pinned version also means that dependabot will keep
this in the pinend version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.
The breaking change between v2 and v3 is that v3 requires Node 20 which
is a non-issue as we are running this on ubuntu latest.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Pin actions/checkout usage to latest 5.x. These were previously pointing
to `@4` which pulls in the latest v4 as they are released, with the
potential to break our workflows if a breaking change or malicious
version on the `@4` stream is ever pushed.
Changing this to a pinned version also means that dependabot will keep
this in the pinend version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.
The breaking change between v4 and v5 is that v5 requires Node 20 which
should be a non-issue where it is used.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Pin actions/checkout usage to latest 3.x or 4.x as appropriate. These
were previously pointing to `@4` or `@3` which pull in the latest
versions at these tags as they are released, with the potential to break
our workflows if a breaking change or malicious version for either of
these streams are released.
Changing this to a pinned version also means that dependabot will keep
this in the pinend version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Add an `AcceptEnv` field to `SSHRule`. This will contain the collection
of environment variable names / patterns that are specified in the
`acceptEnv` block for the SSH rule within the policy file. This will be
used in the tailscale client to filter out unacceptable environment
variables.
Updates: https://github.com/tailscale/corp/issues/22775
Signed-off-by: Mario Minardi <mario@tailscale.com>
Update go.toolchain.rev for https://github.com/tailscale/go/pull/104 and
add a test that, when using the tailscale_go build tag, we use the
right Go toolchain.
We'll crank up the strictness in later commits.
Updates #13527
Change-Id: Ifb09a844858be2beb144a420e4e9dbdc5c03ae3a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
containerboot's main.go had grown to well over 1000 lines with
lots of disparate bits of functionality. This commit is pure copy-
paste to group related functionality outside of the main function
into its own set of files. Everything is still in the main package
to keep the diff incremental and reviewable.
Updates #cleanup
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
mdnsResponder at least as of macOS Sequoia does not find NXDOMAIN
responses to these dns-sd PTR queries acceptable unless they include the
question section in the response. This was found debugging #13511, once
we turned on additional diagnostic reporting from mdnsResponder we
witnessed:
```
Received unacceptable 12-byte response from 100.100.100.100 over UDP via utun6/27 -- id: 0x7F41 (32577), flags: 0x8183 (R/Query, RD, RA, NXDomain), counts: 0/0/0/0,
```
If the response includes a question section, the resposnes are
acceptable, e.g.:
```
Received acceptable 59-byte response from 8.8.8.8 over UDP via en0/17 -- id: 0x2E55 (11861), flags: 0x8183 (R/Query, RD, RA, NXDomain), counts: 1/0/0/0,
```
This may be contributing to an issue under diagnosis in #13511 wherein
some combination of conditions results in mdnsResponder no longer
answering DNS queries correctly to applications on the system for
extended periods of time (multiple minutes), while dig against quad-100
provides correct responses for those same domains. If additional debug
logging is enabled in mdnsResponder we see it reporting:
```
Penalizing server 100.100.100.100 for 60 seconds
```
It is also possible that the reason that macOS & iOS never "stopped
spamming" these queries is that they have never been replied to with
acceptable responses. It is not clear if this special case handling of
dns-sd PTR queries was ever beneficial, and given this evidence may have
always been harmful. If we subsequently observe that the queries settle
down now that they have acceptable responses, we should remove these
special cases - making upstream queries very occasionally isn't a lot of
battery, so we should be better off having to maintain less special
cases and avoid bugs of this class.
Updates #2442
Updates #3025
Updates #3363
Updates #3594
Updates #13511
Signed-off-by: James Tucker <james@tailscale.com>
Updates tailscale/tailscale#13452
Bump the Go toolchain to the latest to pick up changes required to not crash on Android 9/10.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
In prep for upcoming flow tracking & mutex contention optimization
changes, this change refactors (subjectively simplifying) how the DERP
Server accounts for which peers have written to which other peers, to
be able to send PeerGoneReasonDisconnected messages to writes to
uncache their DRPO (DERP Return Path Optimization) routes.
Notably, this removes the Server.sentTo field which was guarded by
Server.mu and checked on all packet sends. Instead, the accounting is
moved to each sclient's sendLoop goroutine and now only needs to
acquire Server.mu for newly seen senders, the first time a peer sends
a packet to that sclient.
This change reduces the number of reasons to acquire Server.mu
per-packet from two to one. Removing the last one is the subject of an
upcoming change.
Updates #3560
Updates #150
Change-Id: Id226216d6629d61254b6bfd532887534ac38586c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This un-breaks vim-go (which doesn't understand "go 1.23") and allows
the natlab tests to work in a Nix shell (by adding the "qemu-img" and
"mkfs.ext4" binaries to the shell). These binaries are available even on
macOS, as I'm testing on my M1 Max.
Updates #13038
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I99f8521b5de93ea47dc33b099d5b243ffc1303da
Now that we have our API docs hosted at https://tailscale.com/api we can
remove the previous (and now outdated) markdown based docs. The top
level api.md has been left with the only content being the redirect to
the new docs.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
netcheck.Client.GetReport() applies its own deadlines. This 2s deadline
was causing GetReport() to never fall back to HTTPS/ICMP measurements
as it was shorter than netcheck.stunProbeTimeout, leaving no time
for fallbacks.
Updates #13394
Updates #6187
Signed-off-by: Jordan Whited <jordan@tailscale.com>
And update a few callers as examples of motivation. (there are a
couple others, but these are the ones where it's prettier)
Updates #cleanup
Change-Id: Ic8c5cb7af0a59c6e790a599136b591ebe16d38eb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
73280595a8 for #2751 added a "clientSet" interface to
distinguish the two cases of a client being singly connected (the
common case) vs tolerating multiple connections from the client at
once. At the time (three years ago) it was kinda an experiment
and we didn't know whether it'd stop the reconnect floods we saw
from certain clients. It did.
So this promotes it to a be first-class thing a bit, removing the
interface. The old tests from 73280595a were invaluable in ensuring
correctness while writing this change (they failed a bunch).
But the real motivation for this change is that it'll permit a future
optimization to add flow tracking for stats & performance where we
don't contend on Server.mu for each packet sent via DERP. Instead,
each client can track its active flows and hold on to a *clientSet and
ask the clientSet per packet what the active client is via one atomic
load rather than a mutex. And if the atomic load returns nil, we'll
know we need to ask the server to see if they died and reconnected and
got a new clientSet. But that's all coming later.
Updates #3560
Change-Id: I9ccda3e5381226563b5ec171ceeacf5c210e1faf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When the desired netfilter mode was unset, we would always try
to use the `iptables` binary. In such cases if iptables was not found,
tailscaled would just crash as seen in #13440. To work around this, in those
cases check if the `iptables` binary even exists and if it doesn't fall back
to the nftables implementation.
Verified that it works on stock Ubuntu 24.04.
Updates #5621
Updates #8555
Updates #8762Fixes#13440
Signed-off-by: Maisem Ali <maisem@tailscale.com>
cmd/k8s-operator,k8s-operator,kube: Add TSRecorder CRD + controller
Deploys tsrecorder images to the operator's cluster. S3 storage is
configured via environment variables from a k8s Secret. Currently
only supports a single tsrecorder replica, but I've tried to take early
steps towards supporting multiple replicas by e.g. having a separate
secret for auth and state storage.
Example CR:
```yaml
apiVersion: tailscale.com/v1alpha1
kind: Recorder
metadata:
name: rec
spec:
enableUI: true
```
Updates #13298
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This mimics having Tailscale in the 'Stopped' state by programming an
empty DNS configuration when the current node key is expired.
Updates tailscale/support-escalations#55
Change-Id: I68ff4665761fb621ed57ebf879263c2f4b911610
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
It was scaring people. It's been pretty stable for quite some time now
and we're unlikely to change the API and break people at this point.
We might, but have been trying not to.
Fixestailscale/corp#22933
Change-Id: I0c3c79b57ccac979693c62ba320643a940ac947e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Rename kube/{types,client,api} -> kube/{kubetypes,kubeclient,kubeapi}
so that we don't need to rename the package on each import to
convey that it's kubernetes specific.
Updates#cleanup
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Further split kube package into kube/{client,api,types}. This is so that
consumers who only need constants/static types don't have to import
the client and api bits.
Updates#cleanup
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We already disable dynamic updates by setting DisableDynamicUpdate to 1 for the Tailscale interface.
However, this does not prevent non-dynamic DNS registration from happening when `ipconfig /registerdns`
runs and in similar scenarios. Notably, dns/windowsManager.SetDNS runs `ipconfig /registerdns`,
triggering DNS registration for all interfaces that do not explicitly disable it.
In this PR, we update dns/windowsManager.disableDynamicUpdates to also set RegistrationEnabled to 0.
Fixes#13411
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We no longer need this on Windows, and it was never required on other platforms.
It just results in more short-lived connections unless we use HTTP/2.
Updates tailscale/corp#18342
Signed-off-by: Nick Khyl <nickk@tailscale.com>
When tailscaled restarts and our watch connection goes down, we get
stuck in an infinite loop printing `ipnbus error: EOF` (which ended up
consuming all the disk space on my laptop via the log file). Instead,
handle errors in `watchIPNBus` and reconnect after a short delay.
Updates #1708
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Disable TCP & UDP GRO if the probe fails.
torvalds/linux@e269d79c7d broke virtio_net
TCP & UDP GRO causing GRO writes to return EINVAL. The bug was then
resolved later in
torvalds/linux@89add40066. The offending
commit was pulled into various LTS releases.
Updates #13041
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Discovered this while investigating the following issue; I think it's
unrelated, but might as well fix it. Also, add a test helper for
checking things that have an IsZero method using the reflect package.
Updates tailscale/support-escalations#55
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I57b7adde43bcef9483763b561da173b4c35f49e2
When a rotation signature chain reaches a certain size, remove the
oldest rotation signature from the chain before wrapping it in a new
rotation signature.
Since all previous rotation signatures are signed by the same wrapping
pubkey (node's own tailnet lock key), the node can re-construct the
chain, re-signing previous rotation signatures. This will satisfy the
existing certificate validation logic.
Updates #13185
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
With the upcoming syspolicy changes, it's imperative that all syspolicy keys are defined in the syspolicy package
for proper registration. Otherwise, the corresponding policy settings will not be read.
This updates a couple of places where we still use string literals rather than syspolicy consts.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates tailscale/tailscale#13326
This PR begins implementing a `tailscale dns` command group in the Tailscale CLI. It provides an initial implementation of `tailscale dns status` which dumps the state of the internal DNS forwarder.
Two new endpoints were added in LocalAPI to support the CLI functionality:
- `/netmap`: dumps a copy of the last received network map (because the CLI shouldn't have to listen to the ipn bus for a copy)
- `/dns-osconfig`: dumps the OS DNS configuration (this will be very handy for the UI clients as well, as they currently do not display this information)
My plan is to implement other subcommands mentioned in tailscale/tailscale#13326, such as `query`, in later PRs.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
This PR changes how LocalBackend handles interactive (initiated via StartLoginInteractive) and non-interactive (e.g., due to key expiration) logins,
and when it sends the authURL to the connected clients.
Specifically,
- When a user initiates an interactive login by clicking Log In in the GUI, the LocalAPI calls StartLoginInteractive.
If an authURL is available and hasn't expired, we immediately send it to all connected clients, suggesting them to open that URL in a browser.
Otherwise, we send a login request to the control plane and set a flag indicating that an interactive login is in progress.
- When LocalBackend receives an authURL from the control plane, we check if it differs from the previous one and whether an interactive login
is in progress. If either condition is true, we notify all connected clients with the new authURL and reset the interactive login flag.
We reset the auth URL and flags upon a successful authentication, when a different user logs in and when switching Tailscale login profiles.
Finally, we remove the redundant dedup logic added to WatchNotifications in #12096 and revert the tests to their original state to ensure that
calling StartLoginInteractive always produces BrowseToURL notifications, either immediately or when the authURL is received from the control plane.
Fixes#13296
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates tailscale/tailscale#177
It appears that the OSS distribution of `tailscaled` is currently unable to get the current system base DNS configuration, as GetBaseConfig() in manager_darwin.go is unimplemented. This PR adds a basic implementation that reads the current values in `/etc/resolv.conf`, to at least unblock DNS resolution via Quad100 if `--accept-dns` is enabled.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
We've added more probe targets recently which has resulted in more
timeouts behind restrictive NATs in localized testing that don't
like how many flows we are creating at once. Not so much an issue
for datacenter or cloud-hosted deployments.
Updates tailscale/corp#22114
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We add package defining interfaces for policy stores, enabling creation of policy sources
and reading settings from them. It includes a Windows-specific PlatformPolicyStore for GP and MDM
policies stored in the Registry, and an in-memory TestStore for testing purposes.
We also include an internal package that tracks and reports policy usage metrics when a policy setting
is read from a store. Initially, it will be used only on Windows and Android, as macOS, iOS, and tvOS
report their own metrics. However, we plan to use it across all platforms eventually.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
* cmd/k8s-operator,k8s-operator/sessonrecording: ensure CastHeader contains terminal size
For tsrecorder to be able to play session recordings, the recording's
CastHeader must have '.Width' and '.Height' fields set to non-zero.
Kubectl (or whoever is the client that initiates the 'kubectl exec'
session recording) sends the terminal dimensions in a resize message that
the API server proxy can intercept, however that races with the first server
message that we need to record.
This PR ensures we wait for the terminal dimensions to be processed from
the first resize message before any other data is sent, so that for all
sessions with terminal attached, the header of the session recording
contains the terminal dimensions and the recording can be played by tsrecorder.
Updates tailscale/tailscale#19821
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Previously, despite what the commit said, we were using a raw IP socket
that was *not* an AF_PACKET socket, and thus was subject to the host
firewall rules. Switch to using a real AF_PACKET socket to actually get
the functionality we want.
Updates #13140
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: If657daeeda9ab8d967e75a4f049c66e2bca54b78
I should've bumped capver in 65fe0ba7b5 but forgot.
This lets us turn off the cryptokey routing change from control for
the affected panicky range of commits, based on capver.
Updates #13332
Updates tailscale/corp#20732
Change-Id: I32c17cfcb45b2369b2b560032330551d47a0ce0b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Both for Raspberry Pis, and for running natlab tests faster on Apple
Silicon Macs without emulating x86.
Not fully wired up yet.
Updates #1866
Updates #13038
Change-Id: I1552bf107069308f325f640773cc881ed735b5ab
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
No need to make callers specify the redundant IP version or
TTL/HopLimit or EthernetType in the common case. The mkPacket helper
can set those when unset.
And use the mkIPLayer in another place, simplifying some code.
And rename mkPacketErr to just mkPacket, then move mkPacket to
test-only code, as mustPacket.
Updates #13038
Change-Id: Ic216e44dda760c69ab9bfc509370040874a47d30
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And clean up some of the test helpers in the process.
Updates #13038
Change-Id: I3e2b5f7028a32d97af7f91941e59399a8e222b25
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I'd added this helper for tests, but then moved it to non-test code
and forgot some places to use it. This uses it in more places to
remove some boilerplate.
Updates #13038
Change-Id: Ic4dc339be1c47a55b71d806bab421097ee3d75ed
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Logging serial numbers every time they are read might have been useful
early on, but seems unnecessary now.
Updates #5902
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
The LocalBackend's state machine starts in NoState and soon transitions to NeedsLogin if there's no auto-start profile,
with the profileManager starting with a new empty profile. Notably, entering the NeedsLogin state blocks engine updates.
We expect the user to transition out of this state by logging in interactively, and we set WantRunning to true when
controlclient enters the StateAuthenticated state.
While our intention is correct, and completing an interactive login should set WantRunning to true, our assumption
that logging into the current Tailscale profile is the only way to transition out of the NeedsLogin state is not accurate.
Another common transition path includes an explicit profile switch (via LocalBackend.SwitchProfile) or an implicit switch
when a Windows user connects to the backend. This results in a bug where WantRunning is set to true even when it was
previously set to false, and the user expressed no intention of changing it.
A similar issue occurs when switching from (sic) a Tailnet that has seamlessRenewalEnabled, regardless of the current state
of the LocalBackend's state machine, and also results in unexpectedly set WantRunning. While this behavior is generally
undesired, it is also incorrect that it depends on the control knobs of the Tailnet we're switching from rather than
the Tailnet we're switching to. However, this issue needs to be addressed separately.
This PR updates LocalBackend.SetControlClientStatus to only set WantRunning to true in response to an interactive login
as indicated by a non-empty authURL.
Fixes#6668Fixes#11280
Updates #12756
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This adds tests for DNS requests, and ignoring IPv6 packets on v4-only
networks.
No behavior changes. But some things are pulled out into functions.
And the mkPacket helpers previously just for tests are moved into
non-test code to be used elsewhere to reduce duplication, doing the
checksum stuff automatically.
Updates #13038
Change-Id: I4dd0b73c75b2b9567b4be3f05a2792999d83f6a3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When the TS_DEBUG_NETSTACK_LOOPBACK_PORT environment variable is set,
netstack will loop back (dnat to addressFamilyLoopback:loopbackPort)
TCP & UDP flows originally destined to localServicesIP:loopbackPort.
localServicesIP is quad-100 or the IPv6 equivalent.
Updates tailscale/corp#22713
Signed-off-by: Jordan Whited <jordan@tailscale.com>
In preparation for multi-user and unattended mode improvements, we are
refactoring and cleaning up `ipn/ipnlocal.profileManager`. The concept of the
"current user", which is only relevant on Windows, is being deprecated and will
soon be removed to allow more than one Windows user to connect and utilize
`LocalBackend` according to that user's access rights to the device and specific
Tailscale profiles.
We plan to pass the user's identity down to the `profileManager`, where it can
be used to determine the user's access rights to a given `LoginProfile`. While
the new permission model in `ipnauth` requires more work and is currently
blocked pending PR reviews, we are updating the `profileManager` to reduce its
reliance on the concept of a single OS user being connected to the backend at
the same time.
We extract the switching to the default Tailscale profile, which may also
trigger legacy profile migration, from `profileManager.SetCurrentUserID`. This
introduces `profileManager.DefaultUserProfileID`, which returns the default
profile ID for the current user, and `profileManager.SwitchToDefaultProfile`,
which is essentially a shorthand for `pm.SwitchProfile(pm.DefaultUserProfileID())`.
Both methods will eventually be updated to accept the user's identity and
utilize that user's default profile.
We make access checks more explicit by introducing the `profileManager.checkProfileAccess`
method. The current implementation continues to use `profileManager.currentUserID`
and `LoginProfile.LocalUserID` to determine whether access to a given profile
should be granted. This will be updated to utilize the `ipnauth` package and the
new permissions model once it's ready. We also expand access checks to be used
more widely in the `profileManager`, not just when switching or listing
profiles. This includes access checks in methods like `SetPrefs` and, most notably,
`DeleteProfile` and `DeleteAllProfiles`, preventing unprivileged Windows users
from deleting Tailscale profiles owned by other users on the same device,
including profiles owned by local admins.
We extract `profileManager.ProfilePrefs` and `profileManager.SetProfilePrefs`
methods that can be used to get and set preferences of a given `LoginProfile` if
`profileManager.checkProfileAccess` permits access to it.
We also update `profileManager.setUnattendedModeAsConfigured` to always enable
unattended mode on Windows if `Prefs.ForceDaemon` is true in the current
`LoginProfile`, even if `profileManager.currentUserID` is `""`. This facilitates
enabling unattended mode via `tailscale up --unattended` even if
`tailscale-ipn.exe` is not running, such as when a Group Policy or MDM-deployed
script runs at boot time, or when Tailscale is used on a Server Code or otherwise
headless Windows environments. See #12239, #2137, #3186 and
https://github.com/tailscale/tailscale/pull/6255#issuecomment-2016623838 for
details.
Fixes#12239
Updates tailscale/corp#18342
Updates #3186
Updates #2137
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This adds support for sending packets to 33:33:00:00:01 at IPv6
multicast address ff02::1 to send to all nodes.
Nothing in Tailscale depends on this (yet?), but it makes debugging in
VMs behind natlab easier (e.g. you can ping all nodes), and other
things might depend on this in the future.
Mostly I'm trying to flesh out the IPv6 support in natlab now that we
can write vnet tests.
Updates #13038
Change-Id: If590031fcf075690ca35c7b230a38c3e72e621eb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Currently, we use PermitRead/PermitWrite/PermitCert permission flags to determine which operations are allowed for a LocalAPI client.
These checks are performed when localapi.Handler handles a request. Additionally, certain operations (e.g., changing the serve config)
requires the connected user to be a local admin. This approach is inherently racey and is subject to TOCTOU issues.
We consider it to be more critical on Windows environments, which are inherently multi-user, and therefore we prevent more than one
OS user from connecting and utilizing the LocalBackend at the same time. However, the same type of issues is also applicable to other
platforms when switching between profiles that have different OperatorUser values in ipn.Prefs.
We'd like to allow more than one Windows user to connect, but limit what they can see and do based on their access rights on the device
(e.g., an local admin or not) and to the currently active LoginProfile (e.g., owner/operator or not), while preventing TOCTOU issues on Windows
and other platforms. Therefore, we'd like to pass an actor from the LocalAPI to the LocalBackend to represent the user performing the operation.
The LocalBackend, or the profileManager down the line, will then check the actor's access rights to perform a given operation on the device
and against the current (and/or the target) profile.
This PR does not change the current permission model in any way, but it introduces the concept of an actor and includes some preparatory
work to pass it around. Temporarily, the ipnauth.Actor interface has methods like IsLocalSystem and IsLocalAdmin, which are only relevant
to the current permission model. It also lacks methods that will actually be used in the new model. We'll be adding these gradually in the next
PRs and removing the deprecated methods and the Permit* flags at the end of the transition.
Updates tailscale/corp#18342
Signed-off-by: Nick Khyl <nickk@tailscale.com>
To test how virtual machines connect to the natlab vnet code.
Updates #13038
Change-Id: Ia4fd4b0c1803580ee7d94cc9878d777ad4f24f82
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And refactor some of vnet.go for testability.
The only behavioral change (with a new test) is that ethernet
broadcasts no longer get sent back to the sender.
Updates #13038
Change-Id: Ic2e7e7d6d8805b7b7f2b5c52c2c5ba97101cef14
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The bad naming (which had only been half updated with the IPv6
changes) tripped me up in the earlier change.
Updates #13038
Change-Id: I65ce07c167e8219d35b87e1f4bf61aab4cac31ff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The reason they weren't working was because the cmd/tta agent in the
guest was dialing out to the test and the vnet couldn't map its global
unicast IPv6 address to a node as it was just using a
map[netip.Addr]*node and blindly trusting the *node was
populated. Instead, it was nil, so the agent connection fetching
didn't work for its RoundTripper and the test could never drive the
node. That map worked for IPv4 but for IPv6 we need to use the method
that takes into account the node's IPv6 SLAAC address. Most call sites
had been converted but I'd missed that one.
Also clean up some debug, and prohibit nodes' link-local unicast
addresses from dialing 2000::/3 directly for now. We can allow that to
be configured opt-in later (some sort of IPv6 NAT mode. Whatever it's
called.) That mode was working on accident, but was confusing: Linux
would do source address selection from link local for the first few
seconds and then after SLAAC and DAD, switch to using the global
unicast source address. Be consistent for now and force it to use the
global unicast.
Updates #13038
Change-Id: I85e973aaa38b43c14611943ff45c7c825ee9200a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Really we need to fix logpolicy + bootstrapDNS to not be so aggressive,
but this is a quick workaround meanwhile.
Without this, tailscaled starts immediately while IPv6 DAD is
happening for a couple seconds and logpolicy freaks out without the
network available and starts spamming stderr about bootstrap DNS
options. But we see that regularly anyway from people whose wifi is
down. So we need to fix the general case. This is not that fix.
Updates #13038
Change-Id: Iba7e536d08e59d34abded1d279f88fdc9c46d94d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There were a few places it could get wedged (notably the dial without
a timeout).
And add a knob for verbose debug logs.
And keep two idle connections always.
Updates #13038
Change-Id: I952ad182d7111481d97a83c12aa2ff4bfdc55fe8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Move all the UDP handling to its own func to remove a bunch of "if
isUDP" checks in a bunch of blocks.
Updates #13038
Change-Id: If71d71b49e57651d15bd307a2233c43751cc8639
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I didn't actually see this, but added this while debugging something
and figured it'd be good to keep.
Updates #13038
Change-Id: I67934c8a329e0233f79c3b08516fd6bad6bfe22a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
On a major link change the LAN routes may change, so on linkChange where
ChangeDelta.Major, we need to call authReconfig to ensure that new
routes are observed and applied.
Updates tailscale/corp#22574
Signed-off-by: James Tucker <james@tailscale.com>
This is the equivalent of quad-100, but for IPv6. This is technically
already contained in the Tailscale IPv6 ULA prefix, but that is only
installed when remote peers are visible via control with contained
addrs. The service addr should always be reachable.
Updates #1152
Signed-off-by: Jordan Whited <jordan@tailscale.com>
And sprinkle some more docs around.
Updates #13038
Change-Id: Ia2dcf567b68170481cc2094d64b085c6b94a778a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We have several checked type assertions to *types.Named in both cmd/cloner and cmd/viewer.
As Go 1.23 updates the go/types package to produce Alias type nodes for type aliases,
these type assertions no longer work as expected unless the new behavior is disabled
with gotypesalias=0.
In this PR, we add codegen.NamedTypeOf(t types.Type), which functions like t.(*types.Named)
but also unrolls type aliases. We then use it in place of type assertions in the cmd/cloner and
cmd/viewer packages where appropriate.
We also update type switches to include *types.Alias alongside *types.Named in relevant cases,
remove *types.Struct cases when switching on types.Type.Underlying and update the tests
with more cases where type aliases can be used.
Updates #13224
Updates #12912
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Go 1.23 updates the go/types package to produce Alias type nodes for type aliases, unless disabled with gotypesalias=0.
This new default behavior breaks codegen.LookupMethod, which uses checked type assertions to types.Named and
types.Interface, as only named types and interfaces have methods.
In this PR, we update codegen.LookupMethod to perform method lookup on the right-hand side of the alias declaration
and clearly switch on the supported type nodes types. We also improve support for various edge cases, such as when an alias
is used as a type parameter constraint, and add tests for the LookupMethod function.
Additionally, we update cmd/viewer/tests to include types with aliases used in type fields and generic type constraints.
Updates #13224
Updates #12912
Signed-off-by: Nick Khyl <nickk@tailscale.com>
All the magic service names with virtual IPs will need IPv6 variants.
Pull this out in prep.
Updates #13038
Change-Id: I53b5eebd0679f9fa43dc0674805049258c83a0de
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So we don't log about them when verbose logging is enabled.
Updates #13038
Change-Id: I925bc3a23e6c93d60dd4fb4bf6a4fdc5a326de95
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The natlab Test Agent (tta) still had its old log streaming hack in
place where it dialed out to anything on TCP port 124 and those logs
were streamed to the host running the tests. But we'd since added gokrazy
syslog streaming support, which made that redundant.
So remove all the port 124 stuff. And then make sure we log to stderr
so gokrazy logs it to syslog.
Also, keep the first 1MB of logs in memory in tta too, exported via
localhost:8034/logs for interactive debugging. That was very useful
during debugging when I added IPv6 support. (which is coming in future
PRs)
Updates #13038
Change-Id: Ieed904a704410b9031d5fd5f014a73412348fa7f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Otherwise you get "Access denied: watch IPN bus access denied, must
set ipn.NotifyNoPrivateKeys when not running as admin/root or
operator".
This lets a non-operator at least start the app and see the status, even
if they can't change everything. (the web UI is unaffected by operator)
A future change can add a LocalAPI call to check permissions and guide
people through adding a user as an operator (perhaps the web client
can do that?)
Updates #1708
Change-Id: I699e035a251b4ebe14385102d5e7a2993424c4b7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This adds a systray app for linux, similar to the apps for macOS and
windows. There are already a number of community-developed systray apps,
but most of them are either long abandoned, are built for a specific
desktop environment, or simply wrap the tailscale CLI.
This uses fyne.io/systray (a fork of github.com/getlantern/systray)
which uses newer D-Bus specifications to render the tray icon and menu.
This results in a pretty broad support for modern desktop environments.
This initial commit lacks a number of features like profile switching,
device listing, and exit node selection. This is really focused on the
application structure, the interaction with LocalAPI, and some system
integration pieces like the app icon, notifications, and the clipboard.
Updates #1708
Signed-off-by: Will Norris <will@tailscale.com>
updates tailcale/corp#22371
For dgram mode, we need to store the write addresses of
the client socket(s) alongside the writer functions and
the write operation needs to use WriteToUnix.
Unix also has multiple clients writing to the same socket,
so the serve method is modified to handle packets from
multiple mac addresses.
Cleans up a bit of cruft from the initial tailmac tooling
commit.
Now all the macOS packets are belong to us.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
And convert a few callers as an example, but nowhere near all.
Updates #12912
Change-Id: I5eaa12a29a6cd03b58d6f1072bd27bc0467852f2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In prep for updating to new staticcheck required for Go 1.23.
Updates #12912
Change-Id: If77892a023b79c6fa798f936fc80428fd4ce0673
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
To avoid dig vs nslookup vs $X availability issues between
OSes/distros. And to be in Go, to match the resolver we use.
Updates #13038
Change-Id: Ib7e5c351ed36b5470a42cbc230b8f27eed9a1bf8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
net/tstun.Wrapper.InjectInboundPacketBuffer is not GSO-aware, which can
break quad-100 TCP streams as a result. Linux is the only platform where
gVisor GSO was previously enabled.
Updates tailscale/corp#22511
Updates #13211
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Instead of changing the working directory before launching the incubator process,
this now just changes the working directory after dropping privileges, at which
point we're more likely to be able to enter the user's home directory since we're
running as the user.
For paths that use the 'login' or 'su -l' commands, those already take care of changing
the working directory to the user's home directory.
Fixes#13120
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This adds a new package containing generic types to be used for defining preference hierarchies.
These include prefs.Item, prefs.List, prefs.StructList, and prefs.StructMap. Each of these types
represents a configurable preference, holding the preference's state, value, and metadata.
The metadata includes the default value (if it differs from the zero value of the Go type)
and flags indicating whether a preference is managed via syspolicy or is hidden/read-only for
another reason. This information can be marshaled and sent to the GUI, CLI and web clients
as a source of truth regarding preference configuration, management, and visibility/mutability states.
We plan to use these types to define device preferences, such as the updater preferences,
the permission mode to be used on Windows with #tailscale/corp#18342, and certain global options
that are currently exposed as tailscaled flags. We also aim to eventually use these types for
profile-local preferences in ipn.Prefs and and as a replacement for ipn.MaskedPrefs.
The generic preference types are compatible with the tailscale.com/cmd/viewer and
tailscale.com/cmd/cloner utilities.
Updates #12736
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In Tailnet Lock, there is an implicit limit on the number of rotation
signatures that can be chained before the signature becomes too long.
This program helps tailnet admins to identify nodes that have signatures
with long chains and prints commands to re-sign those node keys with a
fresh direct signature. It's a temporary mitigation measure, and we will
remove this tool as we design and implement a long-term approach for
rotation signatures.
Example output:
```
2024/08/20 18:25:03 Self: does not need re-signing
2024/08/20 18:25:03 Visible peers with valid signatures:
2024/08/20 18:25:03 Peer xxx2.yy.ts.net. (100.77.192.34) nodeid=nyDmhiZiGA11KTM59, current signature kind=direct: does not need re-signing
2024/08/20 18:25:03 Peer xxx3.yy.ts.net. (100.84.248.22) nodeid=ndQ64mDnaB11KTM59, current signature kind=direct: does not need re-signing
2024/08/20 18:25:03 Peer xxx4.yy.ts.net. (100.85.253.53) nodeid=nmZfVygzkB21KTM59, current signature kind=rotation: chain length 4, printing command to re-sign
tailscale lock sign nodekey:530bddbfbe69e91fe15758a1d6ead5337aa6307e55ac92dafad3794f8b3fc661 tlpub:4bf07597336703395f2149dce88e7c50dd8694ab5bbde3d7c2a1c7b3e231a3c2
```
To support this, the NetworkLockStatus localapi response now includes
information about signatures of all peers rather than just the invalid
ones. This is not displayed by default in `tailscale lock status`, but
will be surfaced in `tailscale lock status --json`.
Updates #13185
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
This involved the following:
1. Pass the su command path as first of args in call to unix.Exec to make sure that busybox sees the correct program name.
Busybox is a single executable userspace that implements various core userspace commands in a single binary. You'll
see it used via symlinking, so that for example /bin/su symlinks to /bin/busybox. Busybox knows that you're trying
to execute /bin/su because argv[0] is '/bin/su'. When we called unix.Exec, we weren't including the program name for
argv[0], which caused busybox to fail with 'applet not found', meaning that it didn't know which command it was
supposed to run.
2. Tell su to whitelist the SSH_AUTH_SOCK environment variable in order to support ssh agent forwarding.
3. Run integration tests on alpine, which uses busybox.
4. Increment CurrentCapabilityVersion to allow turning on SSH V2 behavior from control.
Fixes#12849
Signed-off-by: Percy Wegmann <percy@tailscale.com>
In df6014f1d7 we removed build tag
gating preventing importation, which tripped a NetworkExtension limit
test in corp. This was a reversal of
25f0a3fc8f which actually made the
situation worse, hence the simplification.
This commit goes back to the strategy in
25f0a3fc8f, and gets us back under the
limit in my local testing. Admittedly, we don't fully understand
the effects of importing or excluding importation of this package,
and have seen mixed results, but this commit allows us to move forward
again.
Updates tailscale/corp#22125
Signed-off-by: Jordan Whited <jordan@tailscale.com>
In 2f27319baf we disabled GRO due to a
data race around concurrent calls to tstun.Wrapper.Write(). This commit
refactors GRO to be thread-safe, and re-enables it on Linux.
This refactor now carries a GRO type across tstun and netstack APIs
with a lifetime that is scoped to a single tstun.Wrapper.Write() call.
In 25f0a3fc8f we used build tags to
prevent importation of gVisor's GRO package on iOS as at the time we
believed it was contributing to additional memory usage on that
platform. It wasn't, so this commit simplifies and removes those
build tags.
Updates tailscale/corp#22353
Updates tailscale/corp#22125
Updates #6816
Signed-off-by: Jordan Whited <jordan@tailscale.com>
cmd/k8s-operator/deploy: replace wildcards in Kubernetes Operator RBAC role definitions with verbs
fixes: #13168
Signed-off-by: Pierig Le Saux <pierig@n3xt.io>
Updates tailscale/corp#22120
Adds the ability to start the backend by reading an authkey stored in the syspolicy database (MDM). This is useful for devices that are provisioned in an unattended fashion.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
updates tailcale/corp#22371
Adds custom macOS vm tooling. See the README for
the general gist, but this will spin up VMs with unixgram
capable network interfaces listening to a named socket,
and with a virtio socket device for host-guest communication.
We can add other devices like consoles, serial, etc as needed.
The whole things is buildable with a single make command, and
everything is controllable via the command line using the TailMac
utility.
This should all be generally functional but takes a few shortcuts
with error handling and the like. The virtio socket device support
has not been tested and may require some refinement.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
`DNS unavailable` was marked as a high severity warning. On Android (and other platforms), these trigger a system notification. Here we reduce the severity level to medium. A medium severity warning will still display the warning icon on platforms with a tray icon because of the `ImpactsConnectivity=true` flag being set here, but it won't show a notification anymore. If people enter an area with bad cellular reception, they're bound to receive so many of these notifications and we need to reduce notification fatigue.
Signed-off-by: Andrea Gottardo <andrea@tailscale.com>
In a93dc6cdb1 tryUpgradeToBatchingConn()
moved to build tag gated files, but the runtime.GOOS condition excluding
Android was removed unintentionally from batching_conn_linux.go. Add it
back.
Updates tailscale/corp#22348
Signed-off-by: Jordan Whited <jordan@tailscale.com>
By default, Windows sets the SIO_UDP_CONNRESET and SIO_UDP_NETRESET
options on created UDP sockets. These behaviours make the UDP socket
ICMP-aware; when the system gets an ICMP message (e.g. an "ICMP Port
Unreachable" message, in the case of SIO_UDP_CONNRESET), it will cause
the underlying UDP socket to throw an error. Confusingly, this can occur
even on reads, if the same UDP socket is used to write a packet that
triggers this response.
The Go runtime disabled the SIO_UDP_CONNRESET behavior in 3114bd6, but
did not change SIO_UDP_NETRESET–probably because that socket option
isn't documented particularly well.
Various other networking code seem to disable this behaviour, such as
the Godot game engine (godotengine/godot#22332) and the Eclipse TCF
agent (link below). Others appear to work around this by ignoring the
error returned (anacrolix/dht#16, among others).
For now, until it's clear whether this ends up in the upstream Go
implementation or not, let's also disable the SIO_UDP_NETRESET in a
similar manner to SIO_UDP_CONNRESET.
Eclipse TCF agent: https://gitlab.eclipse.org/eclipse/tcf/tcf.agent/-/blob/master/agent/tcf/framework/mdep.c
Updates #10976
Updates golang/go#68614
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I70a2f19855f8dec1bfb82e63f6d14fc4a22ed5c3
Coder has just adopted nhooyr/websocket which unfortunately changes the import path.
`github.com/coder/coder` imports `tailscale.com/net/wsconn` which was still pointing
to `nhooyr.io/websocket`, but this change updates it.
See https://coder.com/blog/websocket
Updates #13154
Change-Id: I3dec6512472b14eae337ae22c5bcc1e3758888d5
Signed-off-by: Kyle Carberry <kyle@carberry.com>
This PR modifies viewTypeForContainerType to use the last type parameter of a container type
as the value type, enabling the implementation of map-like container types where the second-to-last
(usually first) type parameter serves as the key type.
It also adds a MapContainer type to test the code generation.
Updates #12736
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This prevents two things:
1. Crashing if there's no response body
2. Sending a nonsensical 0 response status code
Updates tailscale/corp#22357
Signed-off-by: Percy Wegmann <percy@tailscale.com>
cmd/k8s-operator,k8s-operator/sessionrecording: support recording WebSocket sessions
Kubernetes currently supports two streaming protocols, SPDY and WebSockets.
WebSockets are replacing SPDY, see
https://github.com/kubernetes/enhancements/issues/4006.
We were currently only supporting SPDY, erroring out if session
was not SPDY and relying on the kube's built-in SPDY fallback.
This PR:
- adds support for parsing contents of 'kubectl exec' sessions streamed
over WebSockets
- adds logic to distinguish 'kubectl exec' requests for a SPDY/WebSockets
sessions and call the relevant handler
Updates tailscale/corp#19821
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
Add functionality to optionally serve a health check endpoint
(off by default).
Users can enable health check endpoint by setting
TS_HEALTHCHECK_ADDR_PORT to [<addr>]:<port>.
Containerboot will then serve an unauthenticatd HTTP health check at
/healthz at that address. The health check returns 200 OK if the
node has at least one tailnet IP address, else returns 503.
Updates tailscale/tailscale#12898
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This latest version allows for building on various OpenBSD architectures.
(such as openbsd/riscv64)
Updates #8043
Change-Id: Ie9a8738e6aa96335214d5750e090db35e526a4a4
Signed-off-by: Aaron Bieber <aaron@bolddaemon.com>
... rather than abusing the generic tsapp.
Per discussion in https://github.com/gokrazy/gokrazy/pull/275
It also means we can remove stuff we don't need, like ntp or randomd.
Updates #13038
Change-Id: Iccf579c354bd3b5025d05fa1128e32f1d5bde4e4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It's too new to be supported in Debian bookworm so just remove it.
It doesn't seem to matter or help speed anything up.
Updates #13038
Change-Id: I39077ba8032bebecd75209552b88f1842c843c33
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
84adfa1ba3 made MAC addresses 1-based too, but didn't adjust this IP address
calculation which was based on the MAC address
Updates #13038
Change-Id: Idc112b303b0b85f41fe51fd61ce1c0d8a3f0f57e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The heartbeat package does nothing if not configured anyway, so don't
even put it in the image and pay the cost of it running.
Updates #13038
Updates #1866
Change-Id: Id22c0fb1f8395ad21ab0e0350973d31730e8d39f
The change in b7e48058c8 was too loose; it also captured the CLI
being run as a child process under cmd/tta.
Updates #13038
Updates #1866
Change-Id: Id410b87132938dd38ed4dd3959473c5d0d242ff5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* cmd/k8s-operator: fix DNS reconciler for dual-stack clusters
This fixes a bug where DNS reconciler logic was always assuming
that no more than one EndpointSlice exists for a Service.
In fact, there can be multiple, for example, in dual-stack
clusters, but also in other cases this is valid (as per kube docs).
This PR:
- allows for multiple EndpointSlices
- picks out the ones for IPv4 family
- deduplicates addresses
Updates tailscale/tailscale#13056
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
These three packages aren't in gokrazy/tsapp/config.json but
used to be. Unfortunately, that meant that were being included
in the resulting image. Apparently `gok` doesn't delete them or
warn about them being present on disk when they're moved from
the config file.
Updates #13038
Updates #1866
Change-Id: I54918a9e3286ea755b11dde5e9efdd433b8f8fb8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We had a mix of 0-based and 1-based nodes and MACs in logs.
Updates #13038
Change-Id: I36d1b00f7f94b37b4ae2cd439bcdc5dbee6eda4d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Using https://github.com/gokrazy/gokrazy/pull/275
This is much lower latency than logcatcher, which is higher latency
and chunkier. And this is better than getting it via 'tailscale debug
daemon-logs', which misses early interesting logs.
Updates #13038
Change-Id: I499ec254c003a9494c0e9910f9c650c8ac44ef33
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Package setting contains types for defining and representing policy settings.
It facilitates the registration of setting definitions using Register and RegisterDefinition,
and the retrieval of registered setting definitions via Definitions and DefinitionOf.
This package is intended for use primarily within the syspolicy package hierarchy,
and added in a preparation for the next PRs.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In particular, tests showing that #3824 works. But that test doesn't
actually work yet; it only gets a DERP connection. (why?)
Updates #13038
Change-Id: Ie1fd1b6a38d4e90fae7e72a0b9a142a95f0b2e8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Similar to UseSocketOnly, but pulled out separately in case
people are doing unknown weird things.
Updates #13038
Change-Id: I7478e5cb9794439b947440b831caa798941845ea
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
getConns() is now responsible for returning both stable and unstable
conns. conn and measureFn are now passed together via connAndMeasureFn.
newConnAndMeasureFn() is responsible for constructing them.
TCP measurement timeouts are adjusted to more closely match netcheck.
Updates tailscale/corp#22114
Signed-off-by: Jordan Whited <jordan@tailscale.com>
To test local connections.
Updates #13038
Change-Id: I575dcab31ca812edf7d04fa126772611cf89b9a7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This adds a new NodeAgentClient type that can be used to
invoke the LocalAPI using the LocalClient instead of
handcrafted URLs. However, there are certain cases where
it does make sense for the node agent to provide more
functionality than whats possible with just the LocalClient,
as such it also exposes a http.Client to make requests directly.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
And don't make guests under vnet/natlab upload to logcatcher,
as there won't be a valid cert anyway.
Updates #13038
Change-Id: Ie1ce0139788036b8ecc1804549a9b5d326c5fef5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
'stun' has been removed from metric names and replaced with a protocol
label. This refactor is preparation work for HTTPS & ICMP support.
Updates tailscale/corp#22114
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Troubleshooting DNS resolution issues often requires additional information.
This PR expands the effect of the TS_DEBUG_DNS_FORWARD_SEND envknob to forwarder.forwardWithDestChan,
and includes the request type, domain name length, and the first 3 bytes of the domain's SHA-256 hash in the output.
Fixes#13070
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In a situation when manual edits are made on the admin panel, around the
GitOps process, the pusher will be stuck if `--fail-on-manual-edits` is
set, as expected.
To recover from this, there are 2 options:
1. revert the admin panel changes to get back in sync with the code
2. check in the manual edits to code
The former will work well, since previous and local ETags will match
control ETag again. The latter will still fail, since local and control
ETags match, but previous does not.
For this situation, check the local ETag against control first and
ignore previous when things are already in sync.
Updates https://github.com/tailscale/corp/issues/22177
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
For cases where users want to be extra careful about not overwriting
manual changes, add a flag to hard-fail. This is only useful if the etag
cache is persistent or otherwise reliable. This flag should not be used
in ephemeral CI workers that won't persist the cache.
Updates https://github.com/tailscale/corp/issues/22177
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
* cmd/tsidp: add funnel support
Updates #10263.
Signed-off-by: Naman Sood <mail@nsood.in>
* look past funnel-ingress-node to see who we're authenticating
Signed-off-by: Naman Sood <mail@nsood.in>
* fix comment typo
Signed-off-by: Naman Sood <mail@nsood.in>
* address review feedback, support Basic auth for /token
Turns out you need to support Basic auth if you do client ID/secret
according to OAuth.
Signed-off-by: Naman Sood <mail@nsood.in>
* fix typos
Signed-off-by: Naman Sood <mail@nsood.in>
* review fixes
Signed-off-by: Naman Sood <mail@nsood.in>
* remove debugging log
Signed-off-by: Naman Sood <mail@nsood.in>
* add comments, fix header
Signed-off-by: Naman Sood <mail@nsood.in>
---------
Signed-off-by: Naman Sood <mail@nsood.in>
This commit adds a batchingConn interface, and renames batchingUDPConn
to linuxBatchingConn. tryUpgradeToBatchingConn() may return a platform-
specific implementation of batchingConn. So far only a Linux
implementation of this interface exists, but this refactor is being
done in anticipation of a Windows implementation.
Updates tailscale/corp#21874
Signed-off-by: Jordan Whited <jordan@tailscale.com>
The same context we use for the HTTP request here might be re-used by
the dialer, which could result in `GotConn` being called multiple times.
We only care about the last one.
Fixes#13009
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
This change adds an HTTP handler with a table showing a list of all
probes, their status, and a button that allows triggering a specific
probe.
Updates tailscale/corp#20583
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
- Keep track of the last 10 probe results and successful probe
latencies;
- Add an HTTP handler that triggers a given probe by name and returns it
result as a plaintext HTML page, showing recent probe results as a
baseline
Updates tailscale/corp#20583
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
I noticed a few places with custom http.Transport where we are not
closing idle connections when transport is no longer used.
Updates tailscale/corp#21609
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
During review of #8644 the `recover-compromised-key` command was renamed
to `revoke-key`, but the old name remained in some messages printed by
the command.
Fixestailscale/corp#19446
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
We were copying 12 out of the 16 bytes which meant that
the 1:1 NAT required would only work if the last 4 bytes
happened to match between the new and old address, something
that our tests accidentally had. Fix it by copying the full
16 bytes and make the tests also verify the addr and use rand
addresses.
Updates #9511
Signed-off-by: Maisem Ali <maisem@tailscale.com>
It was returning a nil `*iptablesRunner` instead of a
nil `NetfilterRunner` interface which would then fail
checks later.
Fixes#13012
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This commit increases gVisor's TCP max send (4->6MiB) and receive
(4->8MiB) buffer sizes on all platforms except iOS. These values are
biased towards higher throughput on high bandwidth-delay product paths.
The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. 100ms of RTT latency is
introduced via Linux's traffic control network emulator queue
discipline.
The first set of results are from commit f0230ce prior to TCP buffer
resizing.
gVisor write direction:
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 180 MBytes 151 Mbits/sec 0 sender
[ 5] 0.00-10.10 sec 179 MBytes 149 Mbits/sec receiver
gVisor read direction:
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.10 sec 337 MBytes 280 Mbits/sec 20 sender
[ 5] 0.00-10.00 sec 323 MBytes 271 Mbits/sec receiver
The second set of results are from this commit with increased TCP
buffer sizes.
gVisor write direction:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 297 MBytes 249 Mbits/sec 0 sender
[ 5] 0.00-10.10 sec 297 MBytes 247 Mbits/sec receiver
gVisor read direction:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.10 sec 501 MBytes 416 Mbits/sec 17 sender
[ 5] 0.00-10.00 sec 485 MBytes 407 Mbits/sec receiver
Updates #9707
Updates tailscale/corp#22119
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Updates tailscale/tailscale#1634
This PR optimizes captive portal detection on Android and iOS by excluding cellular data interfaces (`pdp*` and `rmnet`). As cellular networks do not present captive portals, frequent network switches between Wi-Fi and cellular would otherwise trigger captive detection unnecessarily, causing battery drain.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
This commit implements TCP GRO for packets being written to gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported IP checksum functions.
gVisor is updated in order to make use of newly exported
stack.PacketBuffer GRO logic.
TCP throughput towards gVisor, i.e. TUN write direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement, sometimes as high as 2x. High bandwidth-delay product
paths remain receive window limited, bottlenecked by gVisor's default
TCP receive socket buffer size. This will be addressed in a follow-on
commit.
The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.
The first result is from commit 57856fc without TCP GRO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 4.77 GBytes 4.10 Gbits/sec 20 sender
[ 5] 0.00-10.00 sec 4.77 GBytes 4.10 Gbits/sec receiver
The second result is from this commit with TCP GRO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 10.6 GBytes 9.14 Gbits/sec 20 sender
[ 5] 0.00-10.00 sec 10.6 GBytes 9.14 Gbits/sec receiver
Updates #6816
Signed-off-by: Jordan Whited <jordan@tailscale.com>
I updated the address parsing stuff to return a specific error for
unspecified hosts passed as empty strings, and look for that
when logging errors. I explicitly did not make parseAddress return a
netip.Addr containing an unspecified address because at this layer,
in the absence of any host, we don't necessarily know the address
family we're dealing with.
For the purposes of this code I think this is fine, at least until
we implement #12588.
Fixes#12979
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
It seems some security software or macOS itself might be MITMing TLS
(for ScreenTime?), so don't warn unless it fails x509 validation
against system roots.
Updates #3198
Change-Id: I6ea381b5bb6385b3d51da4a1468c0d803236b7bf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit implements TCP GSO for packets being read from gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported GSO logic from its tun
package.
A new gVisor stack.LinkEndpoint implementation has been established
(linkEndpoint) that is loosely modeled after its predecessor
(channel.Endpoint). This new implementation supports GSO of monster TCP
segments up to 64K in size, whereas channel.Endpoint only supports up to
32K. linkEndpoint will also be required for GRO, which will be
implemented in a follow-on commit.
TCP throughput from gVisor, i.e. TUN read direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement through a wide range of RTT and loss conditions, sometimes
as high as 5x.
The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.
The first result is from commit 57856fc without TCP GSO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender
[ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver
The second result is from this commit with TCP GSO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender
[ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver
Updates #6816
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Fixestailscale/tailscale#12973
Updates tailscale/tailscale#1634
There was a logic issue in the captive detection code we shipped in https://github.com/tailscale/tailscale/pull/12707.
Assume a captive portal has been detected, and the user notified. Upon switching to another Wi-Fi that does *not* have a captive portal, we were issuing a signal to interrupt any pending captive detection attempt. However, we were not also setting the `captive-portal-detected` warnable to healthy. The result was that any "captive portal detected" alert would not be cleared from the UI.
Also fixes a broken log statement value.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
fixes tailscale#12968
The dns manager cleanup func was getting passed a nil
health tracker, which will panic. Fixed to pass it
the system health tracker.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
It is no longer correct to state that we don't support running Tailscale in containers or on Kubernetes.
Updates tailscale/tailscale#12842
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Add a warning that the Dockerfile in the OSS repo is not the
currently used mechanism to build the images we publish - for folks
who want to contribute to image build scripts or otherwise need to
understand the image build process that we use.
Updates#cleanup
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
All wasi* are GOARCH wasm, so check that instead.
Updates #12732
Change-Id: Id3cc346295c1641bcf80a6c5eb1ad65488509656
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
updates tailscale/corp#21823
Misconfigured, broken, or blocked DNS will often present as
"internet is broken'" to the end user. This plumbs the health tracker
into the dns manager and forwarder and adds a health warning
with a 5 second delay that is raised on failures in the forwarder and
lowered on successes.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
cmd/k8s-operator,k8s-operator/sessionrecording,sessionrecording,ssh/tailssh: refactor session recording functionality
Refactor SSH session recording functionality (mostly the bits related to
Kubernetes API server proxy 'kubectl exec' session recording):
- move the session recording bits used by both Tailscale SSH
and the Kubernetes API server proxy into a shared sessionrecording package,
to avoid having the operator to import ssh/tailssh
- move the Kubernetes API server proxy session recording functionality
into a k8s-operator/sessionrecording package, add some abstractions
in preparation for adding support for a second streaming protocol (WebSockets)
Updates tailscale/corp#19821
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Allows the use of tsweb.LogHandler exclusively for callbacks describing the
handler HTTP requests.
Fixes#12837
Signed-off-by: Paul Scott <paul@tailscale.com>
Re-instates the functionality that generates CRD API docs, but using
a different library as the one we were using earlier seemed to have
some issues with its Git history.
Also regenerates the docs (make kube-generate-all).
Updates tailscale/tailscale#12859
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Remove the restriction that getent is skipped on non-Linux unixes.
Improve validation of the parsed output from getent, in case unknown
systems return unusable information.
Fixes#12730.
Signed-off-by: Ross Williams <ross@ross-williams.net>
Updates tailscale/corp#21949
As discussed with @raggi, this PR updates the static DERPMap embedded in the client to reflect the availability of HTTP on the DERP servers run by Tailscale.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Updates tailscale/tailscale#1634
This PR introduces a new `captive-portal-detected` Warnable which is set to an unhealthy state whenever a captive portal is detected on the local network, preventing Tailscale from connecting.
ipn/ipnlocal: fix captive portal loop shutdown
Change-Id: I7cafdbce68463a16260091bcec1741501a070c95
net/captivedetection: fix mutex misuse
ipn/ipnlocal: ensure that we don't fail to start the timer
Change-Id: I3e43fb19264d793e8707c5031c0898e48e3e7465
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
wgengine/magicsock,ipn: allow setting static node endpoints via tailscaled config file.
Adds a new StaticEndpoints field to tailscaled config
that can be used to statically configure the endpoints
that the node advertizes. This field will replace
TS_DEBUG_PRETENDPOINTS env var that can be used to achieve the same.
Additionally adds some functionality that ensures that endpoints
are updated when configfile is reloaded.
Also, refactor configuring/reconfiguring components to use the
same functionality when configfile is parsed the first time or
subsequent times (after reload). Previously a configfile reload
did not result in resetting of prefs. Now it does- but does not yet
tell the relevant components to consume the new prefs. This is to
be done in a follow-up.
Updates tailscale/tailscale#12578
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
It is sometimes necessary to change a global lazy.SyncValue for the duration of a test. This PR adds a (*SyncValue[T]).SetForTest method to facilitate that.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The standard library includes these for strings and byte slices,
but it lacks similar functions for generic slices of comparable types.
Although they are not as commonly used, these functions are useful
in scenarios such as working with field index sequences (i.e., []int)
via reflection.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This adds support for container-like types such as Container[T] that
don't explicitly specify a view type for T. Instead, a package implementing
a container type should also implement and export a ContainerView[T, V] type
and a ContainerViewOf(*Container[T]) ContainerView[T, V] function, which
returns a view for the specified container, inferring the element view type V
from the element type T.
Updates #12736
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Some users run "tailscale cert" in a cron job to renew their
certificates on disk. The time until the next cron job run may be long
enough for the old cert to expire with our default heristics.
Add a `--min-validity` flag which ensures that the returned cert is
valid for at least the provided duration (unless it's longer than the
cert lifetime set by Let's Encrypt).
Updates #8725
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Remove fybrik.io/crdoc dependency as it is causing issues for folks attempting
to vendor tailscale using GOPROXY=direct.
This means that the CRD API docs in ./k8s-operator/api.md will no longer
be generated- I am going to look at replacing it with another tool
in a follow-up.
Updates tailscale/tailscale#12859
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Windows requires routes to have a nexthop. Routes created using the interface's local IP address or an unspecified IP address ("0.0.0.0" or "::") as the nexthop are considered on-link routes. Notably, Windows treats on-link subnet routes differently, reserving the last IP in the range as the broadcast IP and therefore prohibiting TCP connections to it, resulting in WSA error 10049: "The requested address is not valid in its context. This does not happen with single-host routes, such as routes to Tailscale IP addresses, but becomes a problem with advertised subnets when all IPs in the range should be reachable.
Before Windows 8, only routes created with an unspecified IP address were considered on-link, so our previous approach of using the interface's own IP as the nexthop likely worked on Windows 7.
This PR updates configureInterface to use the TailscaleServiceIP (100.100.100.100) and its IPv6 counterpart as the nexthop for subnet routes.
Fixestailscale/support-escalations#57
Signed-off-by: Nick Khyl <nickk@tailscale.com>
With this change, the error handling and request logging are all done in defers
after calling inner.ServeHTTP. This ensures that any recovered values which we
want to re-panic with retain a useful stacktrace. However, we now only
re-panic from errorHandler when there's no outside logHandler. Which if you're
using StdHandler there always is. We prefer this to ensure that we are able to
write a 500 Internal Server Error to the client. If a panic hits http.Server
then the response is not sent back.
Updates #12784
Signed-off-by: Paul Scott <paul@tailscale.com>
... and then do approximately nothing with that information, other
than a big TODO. This is mostly me relearning this code and leaving
breadcrumbs for others in the future.
Updates #12724
Signed-off-by: Brad Fitzpatrick <brad@danga.com>
StdHandler/retHandler would previously emit one log line for each request.
If there were multiple StdHandler in the chain, there would be one log line
per instance of retHandler.
With this change, only the outermost StdHandler/logHandler actually logs the
request or invokes OnStart or OnCompletion callbacks. The error-rendering part
of retHandler lives on in errorHandler, and errorHandler passes those errors up
the stack to logHandler through a callback that logHandler places in the
request.Context().
Updates tailscale/corp#19999
Signed-off-by: Paul Scott <paul@tailscale.com>
To match the format of exit node suggestions and ensure that the result
is not ambiguous, relax exit node CLI selection to permit using a FQDN
including the trailing dot.
Updates #12618
Change-Id: I04b9b36d2743154aa42f2789149b2733f8555d3f
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
Fixestailscale/tailscale#12794
We were printing some leftover debug logs within a callback function that would be executed after the test completion, causing the test to fail. This change drops the log calls to address the issue.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
If we get an non-disco presumably-wireguard-encrypted UDP packet from
an IP:port we don't recognize, rather than drop the packet, give it to
WireGuard anyway and let WireGuard try to figure out who it's from and
tell us.
This uses the new hook added in https://github.com/tailscale/wireguard-go/pull/27
Updates tailscale/corp#20732
Change-Id: I5c61a40143810592f9efac6c12808a87f924ecf2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Some operations cannot be implemented with the prior API:
* Iterating over the map and deleting keys
* Iterating over the map and replacing items
* Calling APIs that expect a native Go map
Add a Map.WithLock method that acquires a write-lock on the map
and then calls a user-provided closure with the underlying Go map.
This allows users to interact with the Map as a regular Go map,
but with the gaurantees that it is concurrent safe.
Updates tailscale/corp#9115
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
This adds support for generic types and interfaces to our cloner and viewer codegens.
It updates these packages to determine whether to make shallow or deep copies based
on the type parameter constraints. Additionally, if a template parameter or an interface
type has View() and Clone() methods, we'll use them for getters and the cloner of the
owning structure.
Updates #12736
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates tailscale/tailscale#4136
To reduce the likelihood of presenting spurious warnings, add the ability to delay the visibility of certain Warnables, based on a TimeToVisible time.Duration field on each Warnable. The default is zero, meaning that a Warnable is immediately visible to the user when it enters an unhealthy state.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
This commit truncates any additional information (mainly hostnames) that's passed to controlD via DOH URL in DoHIPsOfBase.
This change is to make sure only resolverID is passed to controlDv6Gen but not the additional information.
Updates: #7946
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
If an optional `hwaddrs` URL parameter is present, add network interface
hardware addresses to the posture identity response.
Just like with serial numbers, this requires client opt-in via MDM or
`tailscale set --posture-checking=true`
(https://tailscale.com/kb/1326/device-identity)
Updates tailscale/corp#21371
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Load Balancers often have more than one ingress IP, so allowing us to
add multiple means we can offer multiple options.
Updates #12578
Change-Id: I4aa49a698d457627d2f7011796d665c67d4c7952
Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
This adds a package with GP-related functions and types to be used in the future PRs.
It also updates nrptRuleDatabase to use the new package instead of its own gpNotificationWatcher implementation.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We added a workaround for --wait, but didn't confirm the other flags,
which were added in systemd 235 and 236. Check systemd version for
deciding when to set all 3 flags.
Fixes#12136
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
While `clientupdate.Updater` won't be able to apply updates on macsys,
we use `clientupdate.CanAutoUpdate` to gate the EditPrefs endpoint in
localAPI. We should allow the GUI client to set AutoUpdate.Apply on
macsys for it to properly get reported to the control plane. This also
allows the tailnet-wide default for auto-updates to propagate to macsys
clients.
Updates https://github.com/tailscale/corp/issues/21339
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
cmd/k8s-operator,ssh/tailssh,tsnet: optionally record kubectl exec sessions
The Kubernetes operator's API server proxy, when it receives a request
for 'kubectl exec' session now reads 'RecorderAddrs', 'EnforceRecorder'
fields from tailcfg.KubernetesCapRule.
If 'RecorderAddrs' is set to one or more addresses (of a tsrecorder instance(s)),
it attempts to connect to those and sends the session contents
to the recorder before forwarding the request to the kube API
server. If connection cannot be established or fails midway,
it is only allowed if 'EnforceRecorder' is not true (fail open).
Updates tailscale/corp#19821
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
For testing. Lee wants to play with 'AWS Global Accelerator Custom
Routing with Amazon Elastic Kubernetes Service'. If this works well
enough, we can promote it.
Updates #12578
Change-Id: I5018347ed46c15c9709910717d27305d0aedf8f4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The DERP Return Path Optimization (DRPO) is over four years old (and
on by default for over two) and we haven't had problems, so time to
remove the emergency shutoff code (controlknob) which we've never
used. The controlknobs are only meant for new features, to mitigate
risk. But we don't want to keep them forever, as they kinda pollute
the code.
Updates #150
Change-Id: If021bc8fd1b51006d8bddd1ffab639bb1abb0ad1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And fix up a bogus comment and flesh out some other comments.
Updates #cleanup
Change-Id: Ia60a1c04b0f5e44e8d9587914af819df8e8f442a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
cmd/containerboot,cmd/k8s-operator: enable IPv6 for fqdn egress proxies
Don't skip installing egress forwarding rules for IPv6 (as long as the host
supports IPv6), and set headless services `ipFamilyPolicy` to
`PreferDualStack` to optionally enable both IP families when possible. Note
that even with `PreferDualStack` set, testing a dual-stack GKE cluster with
the default DNS setup of kube-dns did not correctly set both A and
AAAA records for the headless service, and instead only did so when
switching the cluster DNS to Cloud DNS. For both IPv4 and IPv6 to work
simultaneously in a dual-stack cluster, we require headless services to
return both A and AAAA records.
If the host doesn't support IPv6 but the FQDN specified only has IPv6
addresses available, containerboot will exit with error code 1 and an
error message because there is no viable egress route.
Fixes#12215
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Updates tailscale/tailscale#4136
We should make sure to send the value of ImpactsConnectivity over to the clients using LocalAPI as they need it to display alerts in the GUI properly.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
This change expands the `exit-node list -filter` command to display all
location based exit nodes for the filtered country. This allows users
to switch to alternative servers when our recommended exit node is not
working as intended.
This change also makes the country filter matching case insensitive,
e.g. both USA and usa will work.
Updates #12698
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
Updates tailscale/tailscale#4136
High severity health warning = a system notification will appear, which can be quite disruptive to the user and cause unnecessary concern in the event of a temporary network issue.
Per design decision (@sonovawolf), the severity of all warnings but "network is down" should be tuned down to medium/low. ImpactsConnectivity should be set, to change the icon to an exclamation mark in some cases, but without a notification bubble.
I also tweaked the messaging for update-available, to reflect how each platform gets updates in different ways.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Updates tailscale/corp#20677
The recover function wasn't getting set in the benchmark
tests. Default changed to an empty func.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Fix regression from #8108 (Mar 2023). Since that change, gocross has
always been rebuilt on each run of ./tool/go (gocross-wrapper.sh),
adding ~100ms. (Well, not totally rebuilt; cmd/go's caching still
ends up working fine.)
The problem was $gocross_path was just "gocross", which isn't in my
path (and "." isn't in my $PATH, as it shouldn't be), so this line was
always evaluating to the empty string:
gotver="$($gocross_path gocross-version 2>/dev/null || echo '')"
The ./gocross is fine because of the earlier `cd "$repo_root"`
Updates tailscale/corp#21262
Updates tailscale/corp#21263
Change-Id: I80d25446097a3bb3423490c164352f0b569add5f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
go get github.com/tailscale/mkctr@main
Pulls in changes to support a local target that only pushes
a single-platform image to the machine's local image store.
Fixestailscale/mkctr#18
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
BPF links require that the owning FD remains open, this FD is embedded
into the RawLink returned by the attach function and must live for the
duration of the server.
Updates ENG-4274
Signed-off-by: James Tucker <james@tailscale.com>
Detection of duplicate Network Lock signature chains added in
01847e0123 failed to account for chains
originating with a SigCredential signature, which is used for wrapped
auth keys. This results in erroneous removal of signatures that
originate from the same re-usable auth key.
This change ensures that multiple nodes created by the same re-usable
auth key are not getting filtered out by the network lock.
Updates tailscale/corp#19764
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
This change moves handling of wrapped auth keys to the `tka` package and
adds a test covering auth key originating signatures (SigCredential) in
netmap.
Updates tailscale/corp#19764
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
In hopes it'll be found more.
Updates tailscale/corp#20844
Change-Id: Ic92ee9908f45b88f8770de285f838333f9467465
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
A few other minor language updates.
Updates tailscale/corp#20844
Change-Id: Idba85941baa0e2714688cc8a4ec3e242e7d1a362
Signed-off-by: James Tucker <james@tailscale.com>
And some misc doc tweaks for idiomatic Go style.
Updates #cleanup
Change-Id: I3ca45f78aaca037f433538b847fd6a9571a2d918
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We cannot directly pass a flat domain name into NetUserGetInfo; we must
resolve the address of a domain controller first.
This PR implements the appropriate resolution mechanisms to do that, and
also exposes a couple of new utility APIs for future needs.
Fixes#12627
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
This turns the checklocks workflow into a real check, and adds
annotations to a few basic packages as a starting point.
Updates #12625
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2b0185bae05a843b5257980fc6bde732b1bdd93f
The exit node suggestion CLI command was written with the assumption
that it's possible to provide a stableid on the command line, but this
is incorrect. Instead, it will now emit the name of the exit node.
Fixes#12618
Change-Id: Id7277f395b5fca090a99b0d13bfee7b215bc9802
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
The context can get canceled during backoff, and binding after that
makes the listener impossible to close afterwards.
Fixes#12620.
Signed-off-by: Naman Sood <mail@nsood.in>
To complement the existing `onCompletion` callback, which is called
after request handler.
Updates tailscale/corp#17075
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
We can observe a data race in tests when logging after a test is
finished. `b.onHealthChange` is called in a goroutine after being
registered with `health.Tracker.RegisterWatcher`, which calls callbacks
in `setUnhealthyLocked` in a new goroutine.
See: https://github.com/tailscale/tailscale/actions/runs/9672919302/job/26686038740
Updates #12054
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibf22cc994965d88a9e7236544878d5373f91229e
This PR ties together pseudoconsoles, user profiles, s4u logons, and
process creation into what is (hopefully) a simple API for various
Tailscale services to obtain Windows access tokens without requiring
knowledge of any Windows passwords. It works both for domain-joined
machines (Kerberos) and non-domain-joined machines. The former case
is fairly straightforward as it is fully documented. OTOH, the latter
case is not documented, though it is fully defined in the C headers in
the Windows SDK. The documentation blanks were filled in by reading
the source code of Microsoft's Win32 port of OpenSSH.
We need to do a bit of acrobatics to make conpty work correctly while
creating a child process with an s4u token; see the doc comments above
startProcessInternal for details.
Updates #12383
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Previously, if we had a umask set (e.g. 0027) that prevented creating a
world-readable file, /etc/resolv.conf would be created without the o+r
bit and thus other users may be unable to resolve DNS.
Since a umask only applies to file creation, chmod the file after
creation and before renaming it to ensure that it has the appropriate
permissions.
Updates #12609
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2a05d64f4f3a8ee8683a70be17a7da0e70933137
The logic we added in #11378 would prevent selecting a home DERP if we
have no control connection.
Updates tailscale/corp#18095
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I44bb6ac4393989444e4961b8cfa27dc149a33c6e
Fixestailscale/corp#20677
Replaces the original attempt to rectify this (by injecting a netMon
event) which was both heavy handed, and missed cases where the
netMon event was "minor".
On apple platforms, the fetching the interface's nameservers can
and does return an empty list in certain situations. Apple's API
in particular is very limiting here. The header hints at notifications
for dns changes which would let us react ahead of time, but it's all
private APIs.
To avoid remaining in the state where we end up with no
nameservers but we absolutely need them, we'll react
to a lack of upstream nameservers by attempting to re-query
the OS.
We'll rate limit this to space out the attempts. It seems relatively
harmless to attempt a reconfig every 5 seconds (triggered
by an incoming query) if the network is in this broken state.
Missing nameservers might possibly be a persistent condition
(vs a transient error), but that would also imply that something
out of our control is badly misconfigured.
Tested by randomly returning [] for the nameservers. When switching
between Wifi networks, or cell->wifi, this will randomly trigger
the bug, and we appear to reliably heal the DNS state.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
So non-local users (e.g. Kerberos on FreeIPA) on Linux can be looked
up. Our default binaries are built with pure Go os/user which only
supports the classic /etc/passwd and not any libc-hooked lookups.
Updates #12601
Change-Id: I9592db89e6ca58bf972f2dcee7a35fbf44608a4f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
stunstamp now sends data to Prometheus via remote write, and Prometheus
can serve the same data. Retaining and cleaning up old data in sqlite
leads to long probing pauses, and it's not worth investing more effort
to optimize the schema and/or concurrency model.
Updates tailscale/corp#20344
Signed-off-by: Jordan Whited <jordan@tailscale.com>
PeerPresentFlags was added in 5ffb2668ef but wasn't plumbed through to
the RunConnectionLoop. Rather than add yet another parameter (as
IP:port was added earlier), pass in the raw PeerPresentMessage and
PeerGoneMessage struct values, which are the same things, plus two
fields: PeerGoneReasonType for gone and the PeerPresentFlags from
5ffb2668ef.
Updates tailscale/corp#17816
Change-Id: Ib19d9f95353651ada90656071fc3656cf58b7987
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When the store-appc-routes flag is on for a tailnet we are writing the
routes more often than seems necessary. Investigation reveals that we
are doing so ~every time we observe a dns response, even if this causes
us not to advertise any new routes. So when we have no new routes,
instead do not advertise routes.
Fixes#12593
Signed-off-by: Fran Bull <fran@tailscale.com>
I couldn't convince myself the old way was safe and couldn't lose
writes.
And it seemed too complicated.
Updates tailscale/corp#21104
Change-Id: I17ba7c7d6fd83458a311ac671146a1f6a458a5c1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
sendMeshUpdates tries to write as much as possible without blocking,
being careful to check the bufio.Writer.Available size before writes.
Except that regressed in 6c791f7d60 which made those messages larger, which
meants we were doing network I/O with the Server mutex held.
Updates tailscale/corp#13945
Change-Id: Ic327071d2e37de262931b9b390cae32084811919
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This adds the ability to "peek" at the value of a SyncValue, so that
it's possible to observe a value without computing this.
Updates tailscale/corp#17122
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I06f88c22a1f7ffcbc7ff82946335356bb0ef4622
This is implemented via GetBestInterfaceEx. Should we encounter errors
or fail to resolve a valid, non-Tailscale interface, we fall back to
returning the index for the default interface instead.
Fixes#12551
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Timeouts could already be identified as NaN values on
stunstamp_derp_stun_rtt_ns, but we can't use NaN effectively with
promql to visualize them. So, this commit adds a timeouts metric that
we can use with rate/delta/etc promql functions.
Updates tailscale/corp#20689
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Changes "Accept" TCP logs to display in verbose logs only,
and removes lines from default logging behavior.
Updates #12158
Signed-off-by: Keli Velazquez <keli@tailscale.com>
This allows the SSH_AUTH_SOCK environment variable to work inside of
su and agent forwarding to succeed.
Fixes#12467
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This actually performs a Noise request in the 'debug ts2021' command,
instead of just exiting once we've dialed a connection. This can help
debug certain forms of captive portals and deep packet inspection that
will allow a connection, but will RST the connection when trying to send
data on the post-upgraded TCP connection.
Updates #1634
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1e46ca9c9a0751c55f16373a6a76cdc24fec1f18
So that it can be later used in the 'tailscale debug ts2021' function in
the CLI, to aid in debugging captive portals/WAFs/etc.
Updates #1634
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Iec9423f5e7570f2c2c8218d27fc0902137e73909
Fix regression from bd93c3067e where I didn't notice the
32-bit test failure was real and not its usual slowness-related
regression. Yay failure blindness.
Updates #12526
Change-Id: I00e33bba697e2cdb61a0d76a71b62406f6c2eeb9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Looks like a DERPmap might not be available when we try to get the
name associated with a region ID, and that was causing an intermittent
panic in CI.
Fixes#12534
Change-Id: I4ace53681bf004df46c728cff830b27339254243
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
It was hex-ifying the String() form of key.NodePublic, which was already hex.
I noticed in some logs:
"client 6e6f64656b65793a353537353..."
And thought that 6x6x6x6x looked strange. It's "nodekey:" in hex.
Updates tailscale/corp#20844
Change-Id: Ib9f2d63b37e324420b86efaa680668a9b807e465
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The control plane hasn't sent it to clients in ages.
Updates tailscale/corp#20965
Change-Id: I1d71a4b6dd3f75010a05c544ee39827837c30772
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I meant to do this in the earlier change and had a git fail.
To atone, add a test too while I'm here.
Updates #12486
Updates #12507
Change-Id: I4943b454a2530cb5047636f37136aa2898d2ffc7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Fixestailscale/corp#20971
We added some Warnables for DERP failure situations, but their Text currently spits out the DERP region ID ("10") in the UI, which is super ugly. It would be better to provide the RegionName of the DERP region that is failing. We can do so by storing a reference to the last-known DERP map in the health package whenever we fetch one, and using it when generating the notification text.
This way, the following message...
> Tailscale could not connect to the relay server '10'. The server might be temporarily unavailable, or your Internet connection might be down.
becomes:
> Tailscale could not connect to the 'Seattle' relay server. The server might be temporarily unavailable, or your Internet connection might be down.
which is a lot more user-friendly.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Updates tailscale/corp#20969
Right now, when netcheck starts, it asks tailscaled for a copy of the DERPMap. If it doesn't have one, it makes a HTTPS request to controlplane.tailscale.com to fetch one.
This will always fail if you're on a network with a captive portal actively blocking HTTPS traffic. The code appears to hang entirely because the http.Client doesn't have a Timeout set. It just sits there waiting until the request succeeds or fails.
This adds a timeout of 10 seconds, and logs more details about the status of the HTTPS request.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Updates #4136
Small PR to expose the health Warnables dependencies to the GUI via LocalAPI, so that we can only show warnings for root cause issues, and filter out unnecessary messages before user presentation.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
I noticed we were allocating these every time when they could just
share the same memory. Rather than document ownership, just lock it
down with a view.
I was considering doing all of the fields but decided to just do this
one first as test to see how infectious it became. Conclusion: not
very.
Updates #cleanup (while working towards tailscale/corp#20514)
Change-Id: I8ce08519de0c9a53f20292adfbecd970fe362de0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Adds a new TailscaleProxyReady condition type for use in corev1.Service
conditions.
Also switch our CRDs to use metav1.Condition instead of
ConnectorCondition. The Go structs are seralized identically, but it
updates some descriptions and validation rules. Update k8s
controller-tools and controller-runtime deps to fix the documentation
generation for metav1.Condition so that it excludes comments and
TODOs.
Stop expecting the fake client to populate TypeMeta in tests. See
kubernetes-sigs/controller-runtime#2633 for details of the change.
Finally, make some minor improvements to validation for service hostnames.
Fixes#12216
Co-authored-by: Irbe Krumina <irbe@tailscale.com>
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Previously, we were registering TCP and UDP connections in the same map,
which could result in erroneously removing a mapping if one of the two
connections completes while the other one is still active.
Add a "proto string" argument to these functions to avoid this.
Additionally, take the "proto" argument in LocalAPI, and plumb that
through from the CLI and add a new LocalClient method.
Updates tailscale/corp#20600
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I35d5efaefdfbf4721e315b8ca123f0c8af9125fb
We need to expand our enviornment information to include info about
the Windows store. Thinking about future plans, it would be nice
to include both the packaging mechanism and the distribution mechanism.
In this PR we change packageTypeWindows to check a new registry value
named MSIDist, and concatenate that value to "msi/" when present.
We also remove vestigial NSIS detection.
Updates https://github.com/tailscale/corp/issues/2790
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
The Add method derives a new ID by adding a signed integer
to the ID, treating it as an unsigned 256-bit big-endian integer.
We also add Less and Compare methods to PrivateID to provide
feature parity with existing methods on PublicID.
Updates tailscale/corp#11038
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
validate_udp_checksum was previously indeterminate (not zero) at
declaration, and IPv4 zero value UDP checksum packets were being passed
to the kernel.
Updates tailscale/corp#20689
Signed-off-by: Jordan Whited <jordan@tailscale.com>
* cmd/containerboot: store device ID before setting up proxy routes.
For containerboot instances whose state needs to be stored
in a Kubernetes Secret, we additonally store the device's
ID, FQDN and IPs.
This is used, between other, by the Kubernetes operator,
who uses the ID to delete the device when resources need
cleaning up and writes the FQDN and IPs on various kube
resource statuses for visibility.
This change shifts storing device ID earlier in the proxy setup flow,
to ensure that if proxy routing setup fails,
the device can still be deleted.
Updates tailscale/tailscale#12146
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
* code review feedback
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
---------
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We do not support specific version updates or track switching on macOS.
Do not populate the flag to avoid confusion.
Updates #cleanup
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Previously, we would only compare the current version to resolved latest
version for track. When running `tailscale update --track=stable` from
an unstable build, it would almost always fail because the stable
version is "older". But we should support explicitly switching tracks
like that.
Fixes#12347
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
For pprof cosmetic/confusion reasons more than performance, but it
might have tiny speed benefit.
Updates #12486
Change-Id: I40e03714f3afa3a7e7f5e1fa99b81c7e889b91b6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So profiles show more useful names than just func1, func2, func3, etc.
There will still be func1 on them all, but the symbol before will say
what the lookup type is.
Updates #12486
Change-Id: I910b024a7861394eb83d07f5a899eae338cb1f22
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This moves NewContainsIPFunc from tsaddr to new ipset package.
And wgengine/filter types gets split into wgengine/filter/filtertype,
so netmap (and thus the CLI, etc) doesn't need to bring in ipset,
bart, etc.
Then add a test making sure the CLI deps don't regress.
Updates #1278
Change-Id: Ia246d6d9502bbefbdeacc4aef1bed9c8b24f54d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I noticed the not-local-v6 numbers were nowhere near the v4 numbers
(they should be identical) and then saw this. It meant the
Addr().Next() wasn't picking an IP that was no longer local, as
assumed.
Updates #12486
Change-Id: I18dfb641f00c74c6252666bc41bd2248df15fadd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
NewContainsIPFunc was previously documented as performing poorly if
there were many netip.Prefixes to search over. As such, we never it used it
in such cases.
This updates it to use bart at a certain threshold (over 6 prefixes,
currently), at which point the bart lookup overhead pays off.
This is currently kinda useless because we're not using it. But now we
can and get wins elsewhere. And we can remove the caveat in the docs.
goos: darwin
goarch: arm64
pkg: tailscale.com/net/tsaddr
│ before │ after │
│ sec/op │ sec/op vs base │
NewContainsIPFunc/empty-8 2.215n ± 11% 2.239n ± 1% +1.08% (p=0.022 n=10)
NewContainsIPFunc/cidr-list-1-8 17.44n ± 0% 17.59n ± 6% +0.89% (p=0.000 n=10)
NewContainsIPFunc/cidr-list-2-8 27.85n ± 0% 28.13n ± 1% +1.01% (p=0.000 n=10)
NewContainsIPFunc/cidr-list-3-8 36.05n ± 0% 36.56n ± 13% +1.41% (p=0.000 n=10)
NewContainsIPFunc/cidr-list-4-8 43.73n ± 0% 44.38n ± 1% +1.50% (p=0.000 n=10)
NewContainsIPFunc/cidr-list-5-8 51.61n ± 2% 51.75n ± 0% ~ (p=0.101 n=10)
NewContainsIPFunc/cidr-list-10-8 95.65n ± 0% 68.92n ± 0% -27.94% (p=0.000 n=10)
NewContainsIPFunc/one-ip-8 4.466n ± 0% 4.469n ± 1% ~ (p=0.491 n=10)
NewContainsIPFunc/two-ip-8 8.002n ± 1% 7.997n ± 4% ~ (p=0.697 n=10)
NewContainsIPFunc/three-ip-8 27.98n ± 1% 27.75n ± 0% -0.82% (p=0.012 n=10)
geomean 19.60n 19.07n -2.71%
Updates #12486
Change-Id: I2e2320cc4384f875f41721374da536bab995c1ce
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This abstraction provides a nicer way to work with
maps of slices without having to write out three long type
params.
This also allows it to provide an AsMap implementation which
copies the map and the slices at least.
Updates tailscale/corp#20910
Signed-off-by: Maisem Ali <maisem@tailscale.com>
NewContainsIPFunc returns a contains matcher optimized for its
input. Use that instead of what this did before, always doing a test
over each of a list of netip.Prefixes.
goos: darwin
goarch: arm64
pkg: tailscale.com/wgengine/filter
│ before │ after │
│ sec/op │ sec/op vs base │
FilterMatch/file1-8 32.60n ± 1% 18.87n ± 1% -42.12% (p=0.000 n=10)
Updates #12486
Change-Id: I8f902bc064effb431e5b46751115942104ff6531
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Without this rule, Windows 8.1 and newer devices issue parallel DNS requests to DNS servers
associated with all network adapters, even when "Override local DNS" is enabled and/or
a Mullvad exit node is being used, resulting in DNS leaks.
This also adds "disable-local-dns-override-via-nrpt" nodeAttr that can be used to disable
the new behavior if needed.
Fixestailscale/corp#20718
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates tailscale/tailscale#4136
This PR is the first round of work to move from encoding health warnings as strings and use structured data instead. The current health package revolves around the idea of Subsystems. Each subsystem can have (or not have) a Go error associated with it. The overall health of the backend is given by the concatenation of all these errors.
This PR polishes the concept of Warnable introduced by @bradfitz a few weeks ago. Each Warnable is a component of the backend (for instance, things like 'dns' or 'magicsock' are Warnables). Each Warnable has a unique identifying code. A Warnable is an entity we can warn the user about, by setting (or unsetting) a WarningState for it. Warnables have:
- an identifying Code, so that the GUI can track them as their WarningStates come and go
- a Title, which the GUIs can use to tell the user what component of the backend is broken
- a Text, which is a function that is called with a set of Args to generate a more detailed error message to explain the unhappy state
Additionally, this PR also begins to send Warnables and their WarningStates through LocalAPI to the clients, using ipn.Notify messages. An ipn.Notify is only issued when a warning is added or removed from the Tracker.
In a next PR, we'll get rid of subsystems entirely, and we'll start using structured warnings for all errors affecting the backend functionality.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Fixestailscale/corp#18366.
This PR provides serial number collection on iOS, by allowing system administrators to pass a `DeviceSerialNumber` MDM key which can be read by the `posture` package in Go.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
S4U logons do not automatically load the associated user profile. In this
PR we add UserProfile to handle that part. Windows docs indicate that
we should try to resolve a remote profile path when present, so we attempt
to do so when the local computer is joined to a domain.
Updates #12383
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
We do not intend to use this value for feature support communication in
the future, and have applied changes elsewhere that now fix the expected
value.
Updates tailscale/corp#19391
Updates tailscale/corp#20398
Signed-off-by: James Tucker <james@tailscale.com>
This commit introduces a userspace program for managing an experimental
eBPF XDP STUN server program. derp/xdp contains the eBPF pseudo-C along
with a Go pkg for loading it and exporting its metrics.
cmd/xdpderper is a package main user of derp/xdp.
Updates tailscale/corp#20689
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This refactors the logic for determining whether a packet should be sent
to the host or not into a function, and then adds tests for it.
Updates #11304
Updates #12448
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ief9afa98eaffae00e21ceb7db073c61b170355e5
Fix a bug where, for a subnet router that advertizes
4via6 route, all packets with a source IP matching
the 4via6 address were being sent to the host itself.
Instead, only send to host packets whose destination
address is host's local address.
Fixestailscale/tailscale#12448
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Checking in the incubator as this used to do fails because
the getenforce command is not on the PATH.
Updates #12442
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Fixestailscale/corp#20677
On macOS sleep/wake, we're encountering a condition where reconfigure the network
a little bit too quickly - before apple has set the nameservers for our interface.
This results in a persistent condition where we have no upstream resolver and
fail all forwarded DNS queries.
No upstream nameservers is a legitimate configuration, and we have no (good) way
of determining when Apple is ready - but if we need to forward a query, and we
have no nameservers, then something has gone badly wrong and the network is
very broken.
A simple fix here is to simply inject a netMon event, which will go through the
configuration dance again when we hit the SERVFAIL condition.
Tested by artificially/randomly returning [] for the list of nameservers in the bespoke
ipn-bridge code responsible for getting the nameservers.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
As an alterative to #11935 using #12003.
Updates #11935
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I05f643fe812ceeaec5f266e78e3e529cab3a1ac3
panic("binary built with tailscale_go build tag but failed to read build info or find tailscale.toolchain.rev in build info")
}
want:=strings.TrimSpace(GoToolchainRev)
iftsRev!=want{
ifos.Getenv("TS_PERMIT_TOOLCHAIN_MISMATCH")=="1"{
fmt.Fprintf(os.Stderr,"tailscale.toolchain.rev = %q, want %q; but ignoring due to TS_PERMIT_TOOLCHAIN_MISMATCH=1\n",tsRev,want)
return
}
panic(fmt.Sprintf("binary built with tailscale_go build tag but Go toolchain %q doesn't match github.com/tailscale/tailscale expected value %q; override this failure with TS_PERMIT_TOOLCHAIN_MISMATCH=1",tsRev,want))
// GNOME expands submenus downward in the main menu, rather than flyouts to the side.
// Either as a result of that or another limitation, there seems to be a maximum depth of submenus.
// Mullvad countries that have a city submenu are not being rendered, and so can't be selected.
// Handle this by simply treating all mullvad countries as single-city and select the best peer.
hideMullvadCities=true
case"kde":
// KDE doesn't need a delay, and actually won't render submenus
// if we delay for more than about 400µs.
newMenuDelay=0
default:
// Add a slight delay to ensure the menu is created before adding items.
//
// Systray implementations that use libdbusmenu sometimes process messages out of order,
// resulting in errors such as:
// (waybar:153009): LIBDBUSMENU-GTK-WARNING **: 18:07:11.551: Children but no menu, someone's been naughty with their 'children-display' property: 'submenu'
//
// See also: https://github.com/fyne-io/systray/issues/12
newMenuDelay=10*time.Millisecond
}
}
// onReady is called by the systray package when the menu is ready to be built.
func(menu*Menu)onReady(){
log.Printf("starting")
setAppIcon(disconnected)
menu.rebuild()
}
// updateState updates the Menu state from the Tailscale local client.
log.Printf("Warning: proxy backend ClusterIP is an IPv6 address and the proxy has a IPv4 tailnet address. You might need to disable IPv4 address allocation for the proxy for forwarding to work. See https://github.com/tailscale/tailscale/issues/12156")
}
if!local.IsValid(){
returnfmt.Errorf("no tailscale IP matching family of %s found in %v",dstStr,tsIPs)
returnfmt.Errorf("adding rule to clamp traffic via tailscale0: %v",err)
}
returnnil
}
iflen(v4Backends)!=0{
if!tsv4.IsValid(){
log.Printf("backend targets %v contain at least one IPv4 address, but this node's Tailscale IPs do not contain a valid IPv4 address: %v",backendAddrs,tsIPs)
log.Printf("backend targets %v contain at least one IPv6 address, but this node's Tailscale IPs do not contain a valid IPv6 address: %v",backendAddrs,tsIPs)
}elseif!nfr.HasIPV6NAT(){
log.Printf("backend targets %v contain at least one IPv6 address, but the chosen firewall mode does not support IPv6 NAT",backendAddrs)
log.Printf("backend addresses for egress target %q have changed old IPs %v, new IPs %v trigger egress config resync",nn.Name(),ips,nn.Addresses().AsSlice())
}
returntrue
}
}
}
returnfalse
}
// ensureServiceDeleted ensures that any rules for an egress service are removed
returnfmt.Errorf("error validating whether tailscaled config directory %q contains tailscaled config for current capability version %q: %w. If this is a Tailscale Kubernetes operator proxy, please ensure that the version of the operator is not older than the version of the proxy",dir,file,err)
returnerrors.New("TS_EXPERIMENTAL_VERSIONED_CONFIG_DIR cannot be set in combination with TS_HOSTNAME, TS_EXTRA_ARGS, TS_AUTHKEY, TS_ROUTES, TS_ACCEPT_DNS.")
returnerrors.New("TS_EXPERIMENTAL_ENABLE_FORWARDING_OPTIMIZATIONS is not supported in userspace mode")
}
ifs.HealthCheckAddrPort!=""{
log.Printf("[warning] TS_HEALTHCHECK_ADDR_PORT is deprecated and will be removed in 1.82.0. Please use TS_ENABLE_HEALTH_CHECK and optionally TS_LOCAL_ADDR_PORT instead.")
// - Protocol & Go docs: https://pkg.go.dev/tailscale.com/derp
// - Running a DERP server: https://github.com/tailscale/tailscale/tree/main/cmd/derper#derp
packagemain// import "tailscale.com/cmd/derper"
import(
@@ -13,6 +19,7 @@ import (
"expvar"
"flag"
"fmt"
"html/template"
"io"
"log"
"math"
@@ -22,6 +29,9 @@ import (
"os/signal"
"path/filepath"
"regexp"
"runtime"
runtimemetrics"runtime/metrics"
"strconv"
"strings"
"syscall"
"time"
@@ -48,12 +58,12 @@ var (
configPath=flag.String("c","","config file path")
certMode=flag.String("certmode","letsencrypt","mode for getting a cert. possible options: manual, letsencrypt")
certDir=flag.String("certdir",tsweb.DefaultCertDir("derper-certs"),"directory to store LetsEncrypt certs, if addr's port is :443")
hostname=flag.String("hostname","derp.tailscale.com","LetsEncrypt host name, if addr's port is :443")
hostname=flag.String("hostname","derp.tailscale.com","LetsEncrypt host name, if addr's port is :443. When --certmode=manual, this can be an IP address to avoid SNI checks")
runSTUN=flag.Bool("stun",true,"whether to run a STUN server. It will bind to the same IP (if any) as the --addr flag value.")
runDERP=flag.Bool("derp",true,"whether to run a DERP server. The only reason to set this false is if you're decommissioning a server but want to keep its bootstrap DNS functionality still running.")
meshPSKFile=flag.String("mesh-psk-file",defaultMeshPSKFile(),"if non-empty, path to file containing the mesh pre-shared key file. It should contain some hex string; whitespace is trimmed.")
meshWith=flag.String("mesh-with","","optional comma-separated list of hostnames to mesh with; the server's own hostname can be in the list")
meshWith=flag.String("mesh-with","","optional comma-separated list of hostnames to mesh with; the server's own hostname can be in the list. If an entry contains a slash, the second part names a hostname to be used when dialing the target.")
bootstrapDNS=flag.String("bootstrap-dns-names","","optional comma-separated list of hostnames to make available at /bootstrap-dns")
unpublishedDNS=flag.String("unpublished-bootstrap-dns-names","","optional comma-separated list of hostnames to make available at /bootstrap-dns and not publish in the list. If an entry contains a slash, the second part names a DNS record to poll for its TXT record with a `0` to `100` value for rollout percentage.")
verifyClients=flag.Bool("verify-clients",false,"verify clients to this DERP server through a local tailscaled instance.")
derpMapURL=flag.String("derp-map","https://login.tailscale.com/derpmap/default","URL to DERP map (https:// or file://) or 'local' to use the local tailscaled's DERP map")
versionFlag=flag.Bool("version",false,"print version and exit")
derpMapURL=flag.String("derp-map","https://login.tailscale.com/derpmap/default","URL to DERP map (https:// or file://) or 'local' to use the local tailscaled's DERP map")
versionFlag=flag.Bool("version",false,"print version and exit")
bwTUNIPv4Address=flag.String("bw-tun-ipv4-addr","","if specified, bandwidth probes will be performed over a TUN device at this address in order to exercise TCP-in-TCP in similar fashion to TCP over Tailscale via DERP; we will use a /30 subnet including this IP address")
qdPacketsPerSecond=flag.Int("qd-packets-per-second",0,"if greater than 0, queuing delay will be measured continuously using 260 byte packets (approximate size of a CallMeMaybe packet) sent at this rate per second")
qdPacketTimeout=flag.Duration("qd-packet-timeout",5*time.Second,"queuing delay packets arriving after this period of time from being sent are treated like dropped packets and don't count toward queuing delay timings")
regionCodeOrID=flag.String("region-code","","probe only this region (e.g. 'lax' or '17'); if left blank, all regions will be probed")
apiServer=rootFlagSet.String("api-server","api.tailscale.com","API server to contact")
failOnManualEdits=rootFlagSet.Bool("fail-on-manual-edits",false,"fail if manual edits to the ACLs in the admin panel are detected; when set to false (the default) only a warning is printed")
)
funcmodifiedExternallyError(){
funcmodifiedExternallyError()error{
if*githubSyntax{
fmt.Printf("::warning file=%s,line=1,col=1,title=Policy File Modified Externally::The policy file was modified externally in the admin console.\n",*policyFname)
returnfmt.Errorf("::warning file=%s,line=1,col=1,title=Policy File Modified Externally::The policy file was modified externally in the admin console.",*policyFname)
}else{
fmt.Printf("The policy file was modified externally in the admin console.\n")
returnfmt.Errorf("The policy file was modified externally in the admin console.")
# Repository defaults to DockerHub, but images are also synced to ghcr.io/tailscale/tailscale.
@@ -71,10 +99,14 @@ proxyConfig:
# Note that if you pass multiple tags to this field via `--set` flag to helm upgrade/install commands you must escape the comma (for example, "tag:k8s-proxies\,tag:prod"). See https://github.com/helm/helm/issues/1556
defaultTags:"tag:k8s"
firewallMode:auto
# If defined, this proxy class will be used as the default proxy class for
# service and ingress resources that do not have a proxy class defined. It
# does not apply to Connector resources.
defaultProxyClass:""
# apiServerProxyConfig allows to configure whether the operator should expose
description: 'Connector defines a Tailscale node that will be deployed in the cluster. The node can be configured to act as a Tailscale subnet router and/or a Tailscale exit node. Connector is a cluster-scoped resource. More info:https://tailscale.com/kb/1236/kubernetes-operator#deploying-exit-nodes-and-subnet-routers-on-kubernetes-using-connector-custom-resource'
description:|-
Connector defines a Tailscale node that will be deployed in the cluster. The
node can be configured to act as a Tailscale subnet router and/or a Tailscale
description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info:https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
description:|-
APIVersion defines the versioned schema of this representation of an object.
Servers should convert recognized schemas to the latest internal value, and
may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
type:string
kind:
description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info:https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
description:|-
Kind is a string value representing the REST resource this object represents.
Servers may infer this from the endpoint the client submits requests to.
Cannot be updated.
In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
type:string
metadata:
type:object
spec:
description: 'ConnectorSpec describes the desired Tailscale component. More info:https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status'
description:|-
ConnectorSpec describes the desired Tailscale component.
description:ExitNode defines whether the Connector node should act as a Tailscale exit node. Defaults to false. https://tailscale.com/kb/1103/exit-nodes
description:|-
ExitNode defines whether the Connector device should act as a Tailscale exit node. Defaults to false.
This field is mutually exclusive with the appConnector field.
https://tailscale.com/kb/1103/exit-nodes
type:boolean
hostname:
description:Hostname is the tailnet hostname that should be assigned to the Connector node. If unset, hostname defaults to <connector name>-connector. Hostname can contain lower case letters, numbers and dashes, it must not start or end with a dash and must be between 2 and 63 characters long.
description:|-
Hostname is the tailnet hostname that should be assigned to the
Connector node. If unset, hostname defaults to <connector
name>-connector. Hostname can contain lower case letters, numbers and
dashes, it must not start or end with a dash and must be between 2
and 63 characters long.
type:string
pattern:^[a-z0-9][a-z0-9-]{0,61}[a-z0-9]$
proxyClass:
description:ProxyClass is the name of the ProxyClass custom resource that contains configuration options that should be applied to the resources created for this Connector. If unset, the operator will create resources with the default configuration.
description:|-
ProxyClass is the name of the ProxyClass custom resource that
contains configuration options that should be applied to the
resources created for this Connector. If unset, the operator will
create resources with the default configuration.
type:string
subnetRouter:
description:SubnetRouter defines subnet routes that the Connector node should expose to tailnet. If unset, none are exposed. https://tailscale.com/kb/1019/subnets/
description:|-
SubnetRouter defines subnet routes that the Connector device should
expose to tailnet as a Tailscale subnet router.
https://tailscale.com/kb/1019/subnets/
If this field is unset, the device does not get configured as a Tailscale subnet router.
This field is mutually exclusive with the appConnector field.
type:object
required:
- advertiseRoutes
properties:
advertiseRoutes:
description:AdvertiseRoutes refer to CIDRs that the subnet router should make available. Route values must be strings that represent a valid IPv4 or IPv6 CIDR range. Values can be Tailscale 4via6 subnet routes. https://tailscale.com/kb/1201/4via6-subnets/
description:|-
AdvertiseRoutes refer to CIDRs that the subnet router should make
available. Route values must be strings that represent a valid IPv4
or IPv6 CIDR range. Values can be Tailscale 4via6 subnet routes.
https://tailscale.com/kb/1201/4via6-subnets/
type:array
minItems:1
items:
type:string
format:cidr
tags:
description:Tags that the Tailscale node will be tagged with. Defaults to [tag:k8s]. To autoapprove the subnet routes or exit node defined by a Connector, you can configure Tailscale ACLs to give these tags the necessary permissions. See https://tailscale.com/kb/1018/acls/#auto-approvers-for-routes-and-exit-nodes. If you specify custom tags here, you must also make the operator an owner of these tags. See https://tailscale.com/kb/1236/kubernetes-operator/#setting-up-the-kubernetes-operator. Tags cannot be changed once a Connector node has been created. Tag values must be in form ^tag:[a-zA-Z][a-zA-Z0-9-]*$.
description:|-
Tags that the Tailscale node will be tagged with.
Defaults to [tag:k8s].
To autoapprove the subnet routes or exit node defined by a Connector,
you can configure Tailscale ACLs to give these tags the necessary
permissions.
See https://tailscale.com/kb/1337/acl-syntax#autoapprovers.
If you specify custom tags here, you must also make the operator an owner of these tags.
See https://tailscale.com/kb/1236/kubernetes-operator/#setting-up-the-kubernetes-operator.
Tags cannot be changed once a Connector node has been created.
Tag values must be in form ^tag:[a-zA-Z][a-zA-Z0-9-]*$.
message:The appConnector field is mutually exclusive with exitNode and subnetRouter fields.
status:
description:ConnectorStatus describes the status of the Connector. This is set and managed by the Tailscale operator.
description:|-
ConnectorStatus describes the status of the Connector. This is set
and managed by the Tailscale operator.
type:object
properties:
conditions:
description:List of status conditions to indicate the status of the Connector. Known condition types are `ConnectorReady`.
description:|-
List of status conditions to indicate the status of the Connector.
Known condition types are `ConnectorReady`.
type:array
items:
description:ConnectorCondition contains condition information for a Connector.
description:Condition contains details for one aspect of the current state of this API Resource.
type:object
required:
- lastTransitionTime
- message
- reason
- status
- type
properties:
lastTransitionTime:
description:LastTransitionTime is the timestamp corresponding to the last status change of this condition.
description:|-
lastTransitionTime is the last time the condition transitioned from one status to another.
This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
type:string
format:date-time
message:
description:Message is a human readable description of the details of the last transition, complementing reason.
description:|-
message is a human readable message indicating details about the transition.
This may be an empty string.
type:string
maxLength:32768
observedGeneration:
description:If set, this represents the .metadata.generation that the condition was set based upon. For instance, if .metadata.generation is currently 12, but the .status.condition[x].observedGeneration is 9, the condition is out of date with respect to the current state of the Connector.
description:|-
observedGeneration represents the .metadata.generation that the condition was set based upon.
For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
with respect to the current state of the instance.
type:integer
format:int64
minimum:0
reason:
description:Reason is a brief machine readable explanation for the condition's last transition.
description:|-
reason contains a programmatic identifier indicating the reason for the condition's last transition.
Producers of specific condition types may define expected values and meanings for this field,
and whether the values are considered a guaranteed API.
The value should be a CamelCase string.
This field may not be empty.
type:string
maxLength:1024
minLength:1
pattern:^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
status:
description:Status of the condition, one of ('True', 'False', 'Unknown').
description:status of the condition, one of True, False, Unknown.
type:string
enum:
- "True"
- "False"
- Unknown
type:
description:Type of the condition, known values are (`SubnetRouterReady`).
description:type of condition in CamelCase or in foo.example.com/CamelCase.
description:Hostname is the fully qualified domain name of the Connector node. If MagicDNS is enabled in your tailnet, it is the MagicDNS name of the node.
description:|-
Hostname is the fully qualified domain name of the Connector node.
If MagicDNS is enabled in your tailnet, it is the MagicDNS name of the
node.
type:string
isAppConnector:
description:IsAppConnector is set to true if the Connector acts as an app connector.
type:boolean
isExitNode:
description:IsExitNode is set to true if the Connector acts as an exit node.
type:boolean
subnetRoutes:
description:SubnetRoutes are the routes currently exposed to tailnet via this Connector instance.
description:|-
SubnetRoutes are the routes currently exposed to tailnet via this
Connector instance.
type:string
tailnetIPs:
description:TailnetIPs is the set of tailnet IP addresses (both IPv4 and IPv6) assigned to the Connector node.
description:|-
TailnetIPs is the set of tailnet IP addresses (both IPv4 and IPv6)
description: 'DNSConfig can be deployed to cluster to make a subset of Tailscale MagicDNS names resolvable by cluster workloads. Use this if: A) you need to refer to tailnet services, exposed to cluster via Tailscale Kubernetes operator egress proxies by the MagicDNS names of those tailnet services (usually because the services run over HTTPS) B) you have exposed a cluster workload to the tailnet using Tailscale Ingress and you also want to refer to the workload from within the cluster over the Ingress''s MagicDNS name (usually because you have some callback component that needs to use the same URL as that used by a non-cluster client on tailnet). When a DNSConfig is applied to a cluster, Tailscale Kubernetes operator will deploy a nameserver for ts.net DNS names and automatically populate it with records for any Tailscale egress or Ingress proxies deployed to that cluster. Currently you must manually update your cluster DNS configuration to add the IP address of the deployed nameserver as a ts.net stub nameserver. Instructions for how to do it: https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/#configuration-of-stub-domain-and-upstream-nameserver-using-coredns (for CoreDNS), https://cloud.google.com/kubernetes-engine/docs/how-to/kube-dns (for kube-dns). Tailscale Kubernetes operator will write the address of a Service fronting the nameserver to dsnconfig.status.nameserver.ip. DNSConfig is a singleton - you must not create more than one. NB: if you want cluster workloads to be able to refer to Tailscale Ingress using its MagicDNS name, you must also annotate the Ingress resource with tailscale.com/experimental-forward-cluster-traffic-via-ingress annotation to ensure that the proxy created for the Ingress listens on its Pod IP address. NB:Clusters where Pods get assigned IPv6 addresses only are currently not supported.'
description:|-
DNSConfig can be deployed to cluster to make a subset of Tailscale MagicDNS
names resolvable by cluster workloads. Use this if: A) you need to refer to
tailnet services, exposed to cluster via Tailscale Kubernetes operator egress
proxies by the MagicDNS names of those tailnet services (usually because the
services run over HTTPS)
B) you have exposed a cluster workload to the tailnet using Tailscale Ingress
and you also want to refer to the workload from within the cluster over the
Ingress's MagicDNS name (usually because you have some callback component
that needs to use the same URL as that used by a non-cluster client on
tailnet).
When a DNSConfig is applied to a cluster, Tailscale Kubernetes operator will
deploy a nameserver for ts.net DNS names and automatically populate it with records
for any Tailscale egress or Ingress proxies deployed to that cluster.
Currently you must manually update your cluster DNS configuration to add the
IP address of the deployed nameserver as a ts.net stub nameserver.
Tailscale Kubernetes operator will write the address of a Service fronting
the nameserver to dsnconfig.status.nameserver.ip.
DNSConfig is a singleton - you must not create more than one.
NB: if you want cluster workloads to be able to refer to Tailscale Ingress
using its MagicDNS name, you must also annotate the Ingress resource with
tailscale.com/experimental-forward-cluster-traffic-via-ingress annotation to
ensure that the proxy created for the Ingress listens on its Pod IP address.
NB: Clusters where Pods get assigned IPv6 addresses only are currently not supported.
type:object
required:
- spec
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info:https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
description:|-
APIVersion defines the versioned schema of this representation of an object.
Servers should convert recognized schemas to the latest internal value, and
may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
type:string
kind:
description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info:https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
description:|-
Kind is a string value representing the REST resource this object represents.
Servers may infer this from the endpoint the client submits requests to.
Cannot be updated.
In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
type:string
metadata:
type:object
spec:
description: 'Spec describes the desired DNS configuration. More info:https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status'
description:Configuration for a nameserver that can resolve ts.net DNS names associated with in-cluster proxies for Tailscale egress Services and Tailscale Ingresses. The operator will always deploy this nameserver when a DNSConfig is applied.
description:|-
Configuration for a nameserver that can resolve ts.net DNS names
associated with in-cluster proxies for Tailscale egress Services and
Tailscale Ingresses. The operator will always deploy this nameserver
when a DNSConfig is applied.
type:object
properties:
image:
description:Nameserver image.
description:Nameserver image. Defaults to tailscale/k8s-nameserver:unstable.
type:object
properties:
repo:
description:Repo defaults to tailscale/k8s-nameserver.
type:string
tag:
description:Tag defaults to operator's own tag.
description:Tag defaults to unstable.
type:string
status:
description:Status describes the status of the DNSConfig. This is set and managed by the Tailscale operator.
description:|-
Status describes the status of the DNSConfig. This is set
and managed by the Tailscale operator.
type:object
properties:
conditions:
type:array
items:
description:ConnectorCondition contains condition information for a Connector.
description:Condition contains details for one aspect of the current state of this API Resource.
type:object
required:
- lastTransitionTime
- message
- reason
- status
- type
properties:
lastTransitionTime:
description:LastTransitionTime is the timestamp corresponding to the last status change of this condition.
description:|-
lastTransitionTime is the last time the condition transitioned from one status to another.
This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
type:string
format:date-time
message:
description:Message is a human readable description of the details of the last transition, complementing reason.
description:|-
message is a human readable message indicating details about the transition.
This may be an empty string.
type:string
maxLength:32768
observedGeneration:
description:If set, this represents the .metadata.generation that the condition was set based upon. For instance, if .metadata.generation is currently 12, but the .status.condition[x].observedGeneration is 9, the condition is out of date with respect to the current state of the Connector.
description:|-
observedGeneration represents the .metadata.generation that the condition was set based upon.
For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
with respect to the current state of the instance.
type:integer
format:int64
minimum:0
reason:
description:Reason is a brief machine readable explanation for the condition's last transition.
description:|-
reason contains a programmatic identifier indicating the reason for the condition's last transition.
Producers of specific condition types may define expected values and meanings for this field,
and whether the values are considered a guaranteed API.
The value should be a CamelCase string.
This field may not be empty.
type:string
maxLength:1024
minLength:1
pattern:^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
status:
description:Status of the condition, one of ('True', 'False', 'Unknown').
description:status of the condition, one of True, False, Unknown.
type:string
enum:
- "True"
- "False"
- Unknown
type:
description:Type of the condition, known values are (`SubnetRouterReady`).
description:type of condition in CamelCase or in foo.example.com/CamelCase.
description:IP is the ClusterIP of the Service fronting the deployed ts.net nameserver. Currently you must manually update your cluster DNS config to add this address as a stub nameserver for ts.net for cluster workloads to be able to resolve MagicDNS names associated with egress or Ingress proxies. The IP address will change if you delete and recreate the DNSConfig.
description:|-
IP is the ClusterIP of the Service fronting the deployed ts.net nameserver.
Currently you must manually update your cluster DNS config to add
this address as a stub nameserver for ts.net for cluster workloads to be
able to resolve MagicDNS names associated with egress or Ingress
proxies.
The IP address will change if you delete and recreate the DNSConfig.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.