Initial support for SrcCaps was added in 5ec01bf but it was not actually
working without this.
Updates #12542
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Adds functionality to kube client to emit Events.
Updates kube store to emit Events when tailscaled state has been loaded, updated or if any errors where
encountered during those operations.
This should help in cases where an error related to state loading/updating caused the Pod to crash in a loop-
unlike logs of the originally failed container instance, Events associated with the Pod will still be
accessible even after N restarts.
Updates tailscale/tailscale#14080
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
I merged 5cae7c51bf (removing Notify.BackendLogID) and 93db503565
(adding another reference to Notify.BackendLogID) that didn't have merge
conflicts, but didn't compile together.
This removes the new reference, fixing the build.
Updates #14129
Change-Id: I9bb68efd977342ea8822e525d656817235039a66
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Limit spamming GUIs with boring updates to once in 3 seconds, unless
the notification is relatively interesting and the GUI should update
immediately.
This is basically @barnstar's #14119 but with the logic moved to be
per-watch-session (since the bit is per session), rather than
globally. And this distinguishes notable Notify messages (such as
state changes) and makes them send immediately.
Updates tailscale/corp#24553
Change-Id: I79cac52cce85280ce351e65e76ea11e107b00b49
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The v2 endpoint supports HTTP/2 bidirectional streaming and acks for
received bytes. This is used to detect when a recorder disappears to
more quickly terminate the session.
Updates https://github.com/tailscale/corp/issues/24023
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
* ipn,tailcfg: add VIPService struct and c2n to fetch them from client
Updates tailscale/corp#22743, tailscale/corp#22955
Signed-off-by: Naman Sood <mail@nsood.in>
* more review fixes
Signed-off-by: Naman Sood <mail@nsood.in>
* don't mention PeerCapabilityServicesDestination since it's currently unused
Signed-off-by: Naman Sood <mail@nsood.in>
---------
Signed-off-by: Naman Sood <mail@nsood.in>
Back in the day this testcontrol package only spoke the
nacl-boxed-based control protocol, which used this.
Then we added ts2021, which didn't, but still sometimes used it.
Then we removed the old mode and didn't remove this parameter
in 2409661a0d.
Updates #11585
Change-Id: Ifd290bd7dbbb52b681b3599786437a15bc98b6a5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously we required the program to be running in a test or have
TS_CONTROL_IS_PLAINTEXT_HTTP before we disabled its https fallback
on "http" schema control URLs to localhost with ports.
But nobody accidentally does all three of "http", explicit port
number, localhost and doesn't mean it. And when they mean it, they're
testing a localhost dev control server (like I was) and don't want 443
getting involved.
As of the changes for #13597, this became more annoying in that we
were trying to use a port which wasn't even available.
Updates #13597
Change-Id: Icd00bca56043d2da58ab31de7aa05a3b269c490f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We currently annotate pods with a hash of the tailscaled config so that
we can trigger pod restarts whenever it changes. However, the hash
updates more frequently than is necessary causing more restarts than is
necessary. This commit removes two causes; scaling up/down and removing
the auth key after pods have initially authed to control. However, note
that pods will still restart on scale-up/down because of the updated set
of volumes mounted into each pod. Hopefully we can fix that in a planned
follow-up PR.
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This gets close to all of the remaining ones.
Updates #12912
Change-Id: I9c672bbed2654a6c5cab31e0cbece6c107d8c6fa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It doesn't need a Clone method, like a time.Time, etc.
And then, because Go 1.23+ uses unique.Handle internally for
the netip package types, we can remove those special cases.
Updates #14058 (pulled out from that PR)
Updates tailscale/corp#24485
Change-Id: Iac3548a9417ccda5987f98e0305745a6e178b375
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Perhaps I was too opimistic in #13323 thinking we won't need logs for
this. Let's log a summary of the response without logging specific
identifiers.
Updates tailscale/corp#24437
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Or unless the new "ts_debug_websockets" build tag is set.
Updates #1278
Change-Id: Ic4c4f81c1924250efd025b055585faec37a5491d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Otherwise all the clients only using control/controlhttp for the
ts2021 HTTP client were also pulling in WebSocket libraries, as the
server side always needs to speak websockets, but only GOOS=js clients
speak it.
This doesn't yet totally remove the websocket dependency on Linux because
Linux has a envknob opt-in to act like GOOS=js for manual testing and force
the use of WebSockets for DERP only (not control). We can put that behind
a build tag in a future change to eliminate the dep on all GOOSes.
Updates #1278
Change-Id: I4f60508f4cad52bf8c8943c8851ecee506b7ebc9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Some environments would like to remove Tailscale SSH support for the
binary for various reasons when not needed (either for peace of mind,
or the ~1MB of binary space savings).
Updates tailscale/corp#24454
Updates #1278
Updates #12614
Change-Id: Iadd6c5a393992c254b5dc9aa9a526916f96fd07a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Adds a /disconnect-control local API endpoint that just shuts down control client.
This can be run before shutting down an HA subnet router/app connector replica - it will ensure
that all connection to control are dropped and control thus considers this node inactive and tells
peers to switch over to another replica. Meanwhile the existing connections keep working (assuming
that the replica is given some graceful shutdown period).
Updates tailscale/tailscale#14020
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Sets a custom hostinfo app type for ProxyGroup replicas, similarly
to how we do it for all other Kubernetes Operator managed components.
Updates tailscale/tailscale#13406,tailscale/corp#22920
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
- Basic description of DERP
If configured to do so, also show
- Mailto link to security@tailscale.com
- Link to Tailscale Security Policies
- Link to Tailscale Acceptable Use Policy
Updates tailscale/corp#24092
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This adds a new generic result type (motivated by golang/go#70084) to
try it out, and uses it in the new lineutil package (replacing the old
lineread package), changing that package to return iterators:
sometimes over []byte (when the input is all in memory), but sometimes
iterators over results of []byte, if errors might happen at runtime.
Updates #12912
Updates golang/go#70084
Change-Id: Iacdc1070e661b5fb163907b1e8b07ac7d51d3f83
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Thanks to @davidbuzz for raising the issue in #13973.
Fixes#8272Fixes#13973
Change-Id: Ic413e14d34c82df3c70a97e591b90316b0b4946b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Key changes:
- No mutex for every udp package: replace syncs.Map with regular map for udpTargetConns
- Use socksAddr as map key for better type safety
- Add test for multi udp target
Updates #7581
Change-Id: Ic3d384a9eab62dcbf267d7d6d268bf242cc8ed3c
Signed-off-by: VimT <me@vimt.me>
This commit addresses an issue with the SOCKS5 UDP relay functionality
when using the --tun=userspace-networking option. Previously, UDP packets
were not being correctly routed into the Tailscale network in this mode.
Key changes:
- Replace single UDP connection with a map of connections per target
- Use c.srv.dial for creating connections to ensure proper routing
Updates #7581
Change-Id: Iaaa66f9de6a3713218014cf3f498003a7cac9832
Signed-off-by: VimT <me@vimt.me>
A filesystem was plumbed into netstack in 993acf4475
but hasn't been used since 2d5d6f5403. Remove it.
Noticed while rebasing a Tailscale fork elsewhere.
Updates tailscale/corp#16827
Change-Id: Ib76deeda205ffe912b77a59b9d22853ebff42813
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We were only updating the ProfileManager and not going down
the EditPrefs path which meant the prefs weren't applied
till either the process restarted or some other pref changed.
This makes it so that we reconfigure everything correctly when
ReloadConfig is called.
Updates #13032
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Add an explicit case for exercising preferred DERP hysteresis around
the branch that compares latencies on a percentage basis.
Updates #cleanup
Signed-off-by: Jordan Whited <jordan@tailscale.com>
By counting "/" elements in the pattern we catch many scenarios, but not
the root-level handler. If either of the patterns is "/", compare the
pattern length to pick the right one.
Updates https://github.com/tailscale/corp/issues/8027
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
In this PR, we add the tailscale syspolicy command with two subcommands: list, which displays
policy settings, and reload, which forces a reload of those settings. We also update the LocalAPI
and LocalClient to facilitate these additions.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Make it possible to advertise app connector via a new conffile field.
Also bumps capver - conffile deserialization errors out if unknonw
fields are set, so we need to know which clients understand the new field.
Updates tailscale/tailscale#11113
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This required sharing the dropped packet metric between two packages
(tstun and magicsock), so I've moved its definition to util/usermetric.
Updates tailscale/corp#22075
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
The user-facing metrics are intended to track data transmitted at
the overlay network level.
Updates tailscale/corp#22075
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
In an environment with unstable latency, such as upstream bufferbloat,
there are cases where a full netcheck could drop the prior preferred
DERP (likely home DERP) from future netcheck probe plans. This will then
likely result in a home DERP having a missing sample on the next
incremental netcheck, ultimately resulting in a home DERP move.
This change does not fix our overall response to highly unstable
latency, but it is an incremental improvement to prevent single spurious
samples during a full netcheck from alone triggering a flapping
condition, as now the prior changes to include historical latency will
still provide the desired resistance, and the home DERP should not move
unless latency is consistently worse over a 5 minute period.
Note that there is a nomenclature and semantics issue remaining in the
difference between a report preferred DERP and a home DERP. A report
preferred DERP is aspirational, it is what will be picked as a home DERP
if a home DERP connection needs to be established. A nodes home DERP may
be different than a recent preferred DERP, in which case a lot of
netcheck logic is fallible. In future enhancements much of the DERP move
logic should move to consider the home DERP, rather than recent report
preferred DERP.
Updates #8603
Updates #13969
Signed-off-by: James Tucker <james@tailscale.com>
We make setting.Snapshot JSON-marshallable in preparation for returning it from the LocalAPI.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We add setting.RawValue, a new type that facilitates unmarshalling JSON numbers and arrays
as uint64 and []string (instead of float64 and []any) for policy setting values.
We then use it to make setting.RawItem JSON-marshallable and update the tests.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we implement (but do not use yet, pending #13727 review) a syspolicy/source.Store
that reads policy settings from environment variables. It converts a CamelCase setting.Key,
such as AuthKey or ExitNodeID, to a SCREAMING_SNAKE_CASE, TS_-prefixed environment
variable name, such as TS_AUTH_KEY and TS_EXIT_NODE_ID. It then looks up the variable
and attempts to parse it according to the expected value type. If the environment variable
is not set, the policy setting is considered not configured in this store (the syspolicy package
will still read it from other sources). Similarly, if the environment variable has an invalid value
for the setting type, it won't be used (though the reported/logged error will differ).
Updates #13193
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Now when we have HA for egress proxies, it makes sense to support topology
spread constraints that would allow users to define more complex
topologies of how proxy Pods need to be deployed in relation with other
Pods/across regions etc.
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This adds additional logging on DERP home changes to allow
better troubleshooting.
Updates tailscale/corp#18095
Signed-off-by: Tim Walters <tim@tailscale.com>
updates tailscale/corp#24197
tailmac run now supports the --share option which will allow you
to specify a directory on the host which can be mounted in the guest
using mount_virtiofs vmshare <path>.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
updates tailscale/corp#24197
Generation of the Host.app path was erroneous and tailmac run
would not work unless the pwd was tailmac/bin. Now you can
be able to invoke tailmac from anywhere.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
- `tailscale metrics print`: to show metric values in console
- `tailscale metrics write`: to write metrics to a file (with a tempfile
& rename dance, which is atomic on Unix).
Also, remove the `TS_DEBUG_USER_METRICS` envknob as we are getting
more confident in these metrics.
Updates tailscale/corp#22075
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Not confident this is the right way to expose this, so let's remote it
for now.
Updates tailscale/corp#22075
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
If the client cannot fetch a serial number, write a log message helping
the user understand what happened. Also, don't just return the error
immediately, since we still have a chance to collect network interface
addresses.
Updates #5902
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
trimpath can be inconvenient for IDEs and LSPs that do not always
correctly handle module relative paths, and can also contribute to
caching bugs taking effect. We rarely have a real need for trimpath of
test produced binaries, so avoiding it should be a net win.
Updates #2988
Signed-off-by: James Tucker <james@tailscale.com>
A simple implementation of latency and loss simulation, applied to
writes to the ethernet interface of the NIC. The latency implementation
could be optimized substantially later if necessary.
Updates #13355
Signed-off-by: James Tucker <james@tailscale.com>
During resolv.conf update, old 'search' lines are cleared but '\n' is not
deleted, leaving behind a new blank line on every update.
This adds 's' flag to regexp, so '\n' is included in the match and deleted when
old lines are cleared.
Also, insert missing `\n` when updated 'search' line is appended to resolv.conf.
Signed-off-by: Renato Aguiar <renato@renatoaguiar.net>
Cache state in memory on writes, read from memory
in reads.
kubestore was previously always reading state from a Secret.
This change should fix bugs caused by temporary loss of access
to kube API server and imporove overall performance
Fixes#7671
Updates tailscale/tailscale#12079,tailscale/tailscale#13900
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
In this PR, we update the syspolicy package to utilize syspolicy/rsop under the hood,
and remove syspolicy.CachingHandler, syspolicy.windowsHandler and related code
which is no longer used.
We mark the syspolicy.Handler interface and RegisterHandler/SetHandlerForTest functions
as deprecated, but keep them temporarily until they are no longer used in other repos.
We also update the package to register setting definitions for all existing policy settings
and to register the Registry-based, Windows-specific policy stores when running on Windows.
Finally, we update existing internal and external tests to use the new API and add a few more
tests and benchmarks.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This allows us to print the time that a netcheck was run, which is
useful in debugging.
Updates #10972
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id48d30d4eb6d5208efb2b1526a71d83fe7f9320b
Few changes to resolve TODOs in the code:
- Instead of using a hardcoded IP, get it from the netmap.
- Use 100.100.100.100 as the gateway IP
- Use the /10 CGNAT range instead of a random /24
Updates #2589
Signed-off-by: Maisem Ali <maisem@tailscale.com>
It had bit-rotted likely during the transition to vector io in
76389d8baf. Tested on Ubuntu 24.04
by creating a netns and doing the DHCP dance to get an IP.
Updates #2589
Signed-off-by: Maisem Ali <maisem@tailscale.com>
In f77821fd63 (released in v1.72.0), we made the client tell a DERP server
when the connection was not its ideal choice (the first node in its region).
But we didn't do anything with that information until now. This adds a
metric about how many such connections are on a given derper, and also
adds a bit to the PeerPresentFlags bitmask so watchers can identify
(and rebalance) them.
Updates tailscale/corp#372
Change-Id: Ief8af448750aa6d598e5939a57c062f4e55962be
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates tailscale/tailscale#13839
Adds a new blockblame package which can detect common MITM SSL certificates used by network appliances. We use this in `tlsdial` to display a dedicated health warning when we cannot connect to control, and a network appliance MITM attack is detected.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Clamp the min and max version for DSM 7.0 and DSM 7.2 packages when we
are building packages for the synology package centre. This change
leaves packages destined for pkgs.tailscale.com with just the min
version set to not break packages in the wild / our update flow.
Updates https://github.com/tailscale/corp/issues/22908
Signed-off-by: Mario Minardi <mario@tailscale.com>
GetReport() may have side effects when the caller enforces a deadline
that is shorter than ReportTimeout.
Updates #13783
Updates #13394
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We add the ClientID() method to the ipnauth.Actor interface and updated ipnserver.actor to implement it.
This method returns a unique ID of the connected client if the actor represents one. It helps link a series
of interactions initiated by the client, such as when a notification needs to be sent back to a specific session,
rather than all active sessions, in response to a certain request.
We also add LocalBackend.WatchNotificationsAs and LocalBackend.StartLoginInteractiveAs methods,
which are like WatchNotifications and StartLoginInteractive but accept an additional parameter
specifying an ipnauth.Actor who initiates the operation. We store these actor identities in
watchSession.owner and LocalBackend.authActor, respectively,and implement LocalBackend.sendTo
and related helper methods to enable sending notifications to watchSessions associated with actors
(or, more broadly, identifiable recipients).
We then use the above to change who receives the BrowseToURL notifications:
- For user-initiated, interactive logins, the notification is delivered only to the user who initiated the
process. If the initiating actor represents a specific connected client, the URL notification is sent back
to the same LocalAPI client that called StartLoginInteractive. Otherwise, the notification is sent to all
clients connected as that user.
Currently, we only differentiate between users on Windows, as it is inherently a multi-user OS.
- In all other cases (e.g., node key expiration), we send the notification to all connected users.
Updates tailscale/corp#18342
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Write timeouts can be indicative of stalled TCP streams. Understanding
changes in the rate of such events can be helpful in an ops context.
Updates tailscale/corp#23668
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This fixes the installation on newer Fedora versions that use dnf5 as
the 'dnf' binary.
Updates #13828
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I39513243c81640fab244a32b7dbb3f32071e9fce
Adds logic to `checkExitNodePrefsLocked` to return an error when
attempting to use exit nodes on a platform where this is not supported.
This mirrors logic that was added to error out when trying to use `ssh`
on an unsupported platform, and has very similar semantics.
Fixes https://github.com/tailscale/tailscale/issues/13724
Signed-off-by: Mario Minardi <mario@tailscale.com>
While looking at deflaking TestTwoDevicePing/ping_1.0.0.2_via_SendPacket,
there were a bunch of distracting:
WARNING: (non-fatal) nil health.Tracker (being strict in CI): ...
This pacifies those so it's easier to work on actually deflaking the test.
Updates #11762
Updates #11874
Change-Id: I08dcb44511d4996b68d5f1ce5a2619b555a2a773
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* updates to LocalBackend require metrics to be passed in which are now initialized
* os.MkdirTemp isn't supported in wasm/js so we simply return empty
string for logger
* adds a UDP dialer which was missing and led to the dialer being
incompletely initialized
Fixes#10454 and #8272
Signed-off-by: Christian <christian@devzero.io>
In this PR we add syspolicy/rsop package that facilitates policy source registration
and provides access to the resultant policy merged from all registered sources for a
given scope.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
For a customer that wants to run their own DERP prober, let's add a
/healthz endpoint that can be used to monitor derpprobe itself.
Updates #6526
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Iba315c999fc0b1a93d8c503c07cc733b4c8d5b6b
Our existing container-detection tricks did not work on Kubernetes,
where Docker is no longer used as a container runtime. Extends the
existing go build tags for containers to the other container packages
and uses that to reliably detect builds that were created by Tailscale
for use in a container. Unfortunately this doesn't necessarily improve
detection for users' custom builds, but that's a separate issue.
Updates #13825
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
connstats currently increments the packet counter whenever it is called
to store a length of data, however when udp batch sending was introduced
we pass the length for a series of packages, and it is only incremented
ones, making it count wrongly if we are on a platform supporting udp
batches.
Updates tailscale/corp#22075
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
This allows passing through any environment variables that we set ourselves, for example DBUS_SESSION_BUS_ADDRESS.
Updates #11175
Co-authored-by: Mario Minardi <mario@tailscale.com>
Signed-off-by: Percy Wegmann <percy@tailscale.com>
The bools.Compare function compares boolean values
by reporting -1, 0, +1 for ordering so that it can be easily
used with slices.SortFunc.
Updates #cleanup
Updates tailscale/corp#11038
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
If multiple upstream DNS servers are available, quad-100 sends requests to all of them
and forwards the first successful response, if any. If no successful responses are received,
it propagates the first failure from any of them.
This PR adds some test coverage for these scenarios.
Updates #13571
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We currently have two executions paths where (*forwarder).forwardWithDestChan
returns nil, rather than an error, without sending a DNS response to responseChan.
These paths are accompanied by a comment that reads:
// Returning an error will cause an internal retry, there is
// nothing we can do if parsing failed. Just drop the packet.
But it is not (or no longer longer) accurate: returning an error from forwardWithDestChan
does not currently cause a retry.
Moreover, although these paths are currently unreachable due to implementation details,
if (*forwarder).forwardWithDestChan were to return nil without sending a response to
responseChan, it would cause a deadlock at one call site and a panic at another.
Therefore, we update (*forwarder).forwardWithDestChan to return errors in those two paths
and remove comments that were no longer accurate and misleading.
Updates #cleanup
Updates #13571
Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
If a DoH server returns an HTTP server error, rather than a SERVFAIL within
a successful HTTP response, we should handle it in the same way as SERVFAIL.
Updates #13571
Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
As per the docstring, (*forwarder).forwardWithDestChan should either send to responseChan
and returns nil, or returns a non-nil error (without sending to the channel).
However, this does not hold when all upstream DNS servers replied with an error.
We've been handling this special error path in (*Resolver).Query but not in (*Resolver).HandlePeerDNSQuery.
As a result, SERVFAIL responses from upstream servers were being converted into HTTP 503 responses,
instead of being properly forwarded as SERVFAIL within a successful HTTP response, as per RFC 8484, section 4.2.1:
A successful HTTP response with a 2xx status code (see Section 6.3 of [RFC7231]) is used for any valid DNS response,
regardless of the DNS response code. For example, a successful 2xx HTTP status code is used even with a DNS message
whose DNS response code indicates failure, such as SERVFAIL or NXDOMAIN.
In this PR we fix (*forwarder).forwardWithDestChan to no longer return an error when it sends a response to responseChan,
and remove the special handling in (*Resolver).Query, as it is no longer necessary.
Updates #13571
Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
This helps better distinguish what is generating activity to the
Tailscale public API.
Updates tailscale/corp#23838
Signed-off-by: Percy Wegmann <percy@tailscale.com>
We were using google/uuid in two places and that brought in database/sql/driver.
We didn't need it in either place.
Updates #13760
Updates tailscale/corp#20099
Change-Id: Ieed32f1bebe35d35f47ec5a2a429268f24f11f1f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There's never a tailscaled on iOS. And we can't run child processes to
look for it anyway.
Updates tailscale/corp#20099
Change-Id: Ieb3776f4bb440c4f1c442fdd169bacbe17f23ddb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We probably shouldn't link it in anywhere, but let's fix iOS for now.
Updates #13762
Updates tailscale/corp#20099
Change-Id: Idac116e9340434334c256acba3866f02bd19827c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
One primary purpose of WithLock is to mutate the underlying map.
However, this can lead to a panic if it happens to be nil.
Thus, always allocate a map before passing it to f.
Updates tailscale/corp#11038
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Thus new function allows constructing vizerrors that combine a message
appropriate for display to users with a wrapped underlying error.
Updates tailscale/corp#23781
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Add Keys, Values, and All to iterate over
all keys, values, and entries, respectively.
Updates #11038
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
cmd/k8s-operator,k8s-operator/apis: set a readiness condition on egress Services
Set a readiness condition on ExternalName Services that define a tailnet target
to route cluster traffic to via a ProxyGroup's proxies. The condition
is set to true if at least one proxy is currently set up to route.
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Their callers using Range are all kinda clunky feeling. Iterators
should make them more readable.
Updates #12912
Change-Id: I93461eba8e735276fda4a8558a4ae4bfd6c04922
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We don't need to error out and continuously reconcile if ProxyClass
has not (yet) been created, once it gets created the ProxyGroup
reconciler will get triggered.
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
As discussed in #13684, base the ProxyGroup's proxy definitions on the same
scaffolding as the existing proxies, as defined in proxy.yaml
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Instead of converting our PortMap struct to a string during marshalling
for use as a key, convert the whole collection of PortMaps to a list of
PortMap objects, which improves the readability of the JSON config while
still keeping the data structure we need in the code.
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Currently egress Services for ProxyGroup only work for Pods and Services
with IPv4 addresses. Ensure that it works on dual stack clusters by reading
proxy Pod's IP from the .status.podIPs list that always contains both
IPv4 and IPv6 address (if the Pod has them) rather than .status.podIP that
could contain IPv6 only for a dual stack cluster.
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The default ProxyClass can be set via helm chart or env var, and applies
to all proxies that do not otherwise have an explicit ProxyClass set.
This ensures proxies created by the new ProxyGroup CRD are consistent
with the behaviour of existing proxies
Nearby but unrelated changes:
* Fix up double error logs (controller runtime logs returned errors)
* Fix a couple of variable names
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Rearrange conditionals to reduce indentation and make it a bit easier to read
the logic. Also makes some error message updates for better consistency
with the recent decision around capitalising resource names and the
upcoming addition of config secrets.
Updates #cleanup
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
It is sometimes necessary to defer initialization steps until the first actual usage
or until certain prerequisites have been met. For example, policy setting and
policy source registration should not occur during package initialization.
Instead, they should be deferred until the syspolicy package is actually used.
Additionally, any errors should be properly handled and reported, rather than
causing a panic within the package's init function.
In this PR, we add DeferredInit, to facilitate the registration and invocation
of deferred initialization functions.
Updates #12687
Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
To avoid warning:
find: warning: you have specified the global option -maxdepth after the argument -type, but global options are not positional, i.e., -maxdepth affects tests specified before it as well as those specified after it. Please specify global options before other arguments.
Fixestailscale/corp#23689
Change-Id: I91ee260b295c552c0a029883d5e406733e081478
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Implements the controller for the new ProxyGroup CRD, designed for
running proxies in a high availability configuration. Each proxy gets
its own config and state Secret, and its own tailscale node ID.
We are currently mounting all of the config secrets into the container,
but will stop mounting them and instead read them directly from the kube
API once #13578 is implemented.
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Like we do for the ones on iOS.
As a bonus, this removes a caller of tsaddr.IsTailscaleIP which we
want to revamp/remove soonish.
Updates #13687
Change-Id: Iab576a0c48e9005c7844ab52a0aba5ba343b750e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Per my investigation just now, the $HOME environment variable is unset
on the macsys (standalone macOS GUI) variant, but the current working
directory is valid. Look for the environment variable file in that
location in addition to inside the home directory.
Updates #3707
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I481ae2e0d19b316244373e06865e3b5c3a9f3b88
Extend safeweb.Config with the ability to pass a http.Server that
safeweb will use to server traffic.
Updates corp#8207
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Adds a new reconciler that reconciles ExternalName Services that define a
tailnet target that should be exposed to cluster workloads on a ProxyGroup's
proxies.
The reconciler ensures that for each such service, the config mounted to
the proxies is updated with the tailnet target definition and that
and EndpointSlice and ClusterIP Service are created for the service.
Adds a new reconciler that ensures that as proxy Pods become ready to route
traffic to a tailnet target, the EndpointSlice for the target is updated
with the Pods' endpoints.
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The AddSNATRuleForDst rule was adding a new rule each time it was called including:
- if a rule already existed
- if a rule matching the destination, but with different desired source already existed
This was causing issues especially for the in-progress egress HA proxies work,
where the rules are now refreshed more frequently, so more redundant rules
were being created.
This change:
- only creates the rule if it doesn't already exist
- if a rule for the same dst, but different source is found, delete it
- also ensures that egress proxies refresh firewall rules
if the node's tailnet IP changes
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Updates tailscale/tailscale#3363
We know `log.tailscale.io` supports TLS 1.3, so we can enforce its usage in the client to shake some bytes off the TLS handshake each time a connection is opened to upload logs.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
The new logging in 2dd71e64ac is spammy at shutdown:
Receive func ReceiveIPv6 exiting with error: *net.OpError, read udp [::]:38869: raw-read udp6 [::]:38869: use of closed network connection
Receive func ReceiveIPv4 exiting with error: *net.OpError, read udp 0.0.0.0:36123: raw-read udp4 0.0.0.0:36123: use of closed network connection
Skip it if we're in the process of shutting down.
Updates #10976
Change-Id: I4f6d1c68465557eb9ffe335d43d740e499ba9786
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We were selectively uploading it, but we were still gathering it,
which can be a waste of CPU.
Also remove a bunch of complexity that I don't think matters anymore.
And add an envknob to force service collection off on a single node,
even if the tailnet policy permits it.
Fixes#13463
Change-Id: Ib6abe9e29d92df4ffa955225289f045eeeb279cf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When an Exit Node is used, we create a WFP rule to block all inbound and outbound traffic,
along with several rules to permit specific types of traffic. Notably, we allow all inbound and
outbound traffic to and from LocalRoutes specified in wgengine/router.Config. The list of allowed
routes always includes routes for internal interfaces, such as loopback and virtual Hyper-V/WSL2
interfaces, and may also include LAN routes if the "Allow local network access" option is enabled.
However, these permitting rules do not allow link-local multicast on the corresponding interfaces.
This results in broken mDNS/LLMNR, and potentially other similar issues, whenever an exit node is used.
In this PR, we update (*wf.Firewall).UpdatePermittedRoutes() to create rules allowing outbound and
inbound link-local multicast traffic to and from the permitted IP ranges, partially resolving the mDNS/LLMNR
and *.local name resolution issue.
Since Windows does not attempt to send mDNS/LLMNR queries if a catch-all NRPT rule is present,
it is still necessary to disable the creation of that rule using the disable-local-dns-override-via-nrpt nodeAttr.
Updates #13571
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Of tests we wish we could easily add. One day.
Updates #13038
Change-Id: If44646f8d477674bbf2c9a6e58c3cd8f94a4e8df
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates tailscale/tailscale#6148
This is the result of some observations we made today with @raggi. The DNS over HTTPS client currently doesn't cap the number of connections it uses, either in-use or idle. A burst of DNS queries will open multiple connections. Idle connections remain open for 30 seconds (this interval is defined in the dohTransportTimeout constant). For DoH providers like NextDNS which send keep-alives, this means the cellular modem will remain up more than expected to send ACKs if any keep-alives are received while a connection remains idle during those 30 seconds. We can set the IdleConnTimeout to 10 seconds to ensure an idle connection is terminated if no other DNS queries come in after 10 seconds. Additionally, we can cap the number of connections to 1. This ensures that at all times there is only one open DoH connection, either active or idle. If idle, it will be terminated within 10 seconds from the last query.
We also observed all the DoH providers we support are capable of TLS 1.3. We can force this TLS version to reduce the number of packets sent/received each time a TLS connection is established.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1eaad7d3de regressed some tests in another repo that were starting up
a control server on `http://127.0.0.1:nnn`. Because there was no https
running, and because of a bug in 1eaad7d3de (which ended up checking
the recently-dialed-control check twice in a single dial call), we
ended up forcing only the use of TLS dials in a test that only had
plaintext HTTP running.
Instead, plumb down support for explicitly disabling TLS fallbacks and
use it only when running in a test and using `http` scheme control
plane URLs to 127.0.0.1 or localhost.
This fixes the tests elsewhere.
Updates #13597
Change-Id: I97212ded21daf0bd510891a278078daec3eebaa6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I noticed while debugging a test failure elsewhere that our failure
logs (when verbosity is cranked up) were uselessly attributing dial
failures to failure to dial an invalid IP address (this IPv6 address
we didn't have), rather than showing me the actual IPv4 connection
failure.
Updates #13597 (tangentially)
Change-Id: I45ffbefbc7e25ebfb15768006413a705b941dae5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates tailscale/tailscale#1634
Updates tailscale/tailscale#13265
Captive portal detection uses a custom `net.Dialer` in its `http.Client`. This custom Dialer ensures that the socket is bound specifically to the Wi-Fi interface. This is crucial because without it, if any default routes are set, the outgoing requests for detecting a captive portal would bypass Wi-Fi and go through the default route instead.
The Dialer did not have a Timeout property configured, so the default system timeout was applied. This caused issues in #13265, where we attempted to make captive portal detection requests over an IPsec interface used for Wi-Fi Calling. The call to `connect()` would fail and remain blocked until the system timeout (approximately 1 minute) was reached.
In #13598, I simply excluded the IPsec interface from captive portal detection. This was a quick and safe mitigation for the issue. This PR is a follow-up to make the process more robust, by setting a 3 seconds timeout on any connection establishment on any interface (this is the same timeout interval we were already setting on the HTTP client).
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
We were previously not checking that the external IP that we got back
from a UPnP portmap was a valid endpoint; add minimal validation that
this endpoint is something that is routeable by another host.
Updates tailscale/corp#23538
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id9649e7683394aced326d5348f4caa24d0efd532
This pulls out the clock and forceNoise443 code into methods on the
Dialer as cleanup in its own commit to make a future change less
distracting.
Updates #13597
Change-Id: I7001e57fe7b508605930c5b141a061b6fb908733
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In prep for a future port 80 MITM fix, make the 'debug ts2021' command
retry once after a failure to give it a chance to pick a new strategy.
Updates #13597
Change-Id: Icb7bad60cbf0dbec78097df4a00e9795757bc8e4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* cmd/containerboot,kube,util/linuxfw: configure kube egress proxies to route to 1+ tailnet targets
This commit is first part of the work to allow running multiple
replicas of the Kubernetes operator egress proxies per tailnet service +
to allow exposing multiple tailnet services via each proxy replica.
This expands the existing iptables/nftables-based proxy configuration
mechanism.
A proxy can now be configured to route to one or more tailnet targets
via a (mounted) config file that, for each tailnet target, specifies:
- the target's tailnet IP or FQDN
- mappings of container ports to which cluster workloads will send traffic to
tailnet target ports where the traffic should be forwarded.
Example configfile contents:
{
"some-svc": {"tailnetTarget":{"fqdn":"foo.tailnetxyz.ts.net","ports"{"tcp:4006:80":{"protocol":"tcp","matchPort":4006,"targetPort":80},"tcp:4007:443":{"protocol":"tcp","matchPort":4007,"targetPort":443}}}}
}
A proxy that is configured with this config file will configure firewall rules
to route cluster traffic to the tailnet targets. It will then watch the config file
for updates as well as monitor relevant netmap updates and reconfigure firewall
as needed.
This adds a bunch of new iptables/nftables functionality to make it easier to dynamically update
the firewall rules without needing to restart the proxy Pod as well as to make
it easier to debug/understand the rules:
- for iptables, each portmapping is a DNAT rule with a comment pointing
at the 'service',i.e:
-A PREROUTING ! -i tailscale0 -p tcp -m tcp --dport 4006 -m comment --comment "some-svc:tcp:4006 -> tcp:80" -j DNAT --to-destination 100.64.1.18:80
Additionally there is a SNAT rule for each tailnet target, to mask the source address.
- for nftables, a separate prerouting chain is created for each tailnet target
and all the portmapping rules are placed in that chain. This makes it easier
to look up rules and delete services when no longer needed.
(nftables allows hooking a custom chain to a prerouting hook, so no extra work
is needed to ensure that the rules in the service chains are evaluated).
The next steps will be to get the Kubernetes Operator to generate
the configfile and ensure it is mounted to the relevant proxy nodes.
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The operator creates a non-reusable auth key for each of
the cluster proxies that it creates and puts in the tailscaled
configfile mounted to the proxies.
The proxies are always tagged, and their state is persisted
in a Kubernetes Secret, so their node keys are expected to never
be regenerated, so that they don't need to re-auth.
Some tailnet configurations however have seen issues where the auth
keys being left in the tailscaled configfile cause the proxies
to end up in unauthorized state after a restart at a later point
in time.
Currently, we have not found a way to reproduce this issue,
however this commit removes the auth key from the config once
the proxy can be assumed to have logged in.
If an existing, logged-in proxy is upgraded to this version,
its redundant auth key will be removed from the conffile.
If an existing, logged-in proxy is downgraded from this version
to a previous version, it will work as before without re-issuing key
as the previous code did not enforce that a key must be present.
Updates tailscale/tailscale#13451
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The ProxyGroup CRD specifies a set of N pods which will each be a
tailnet device, and will have M different ingress or egress services
mapped onto them. It is the mechanism for specifying how highly
available proxies need to be. This commit only adds the definition, no
controller loop, and so it is not currently functional.
This commit also splits out TailnetDevice and RecorderTailnetDevice
into separate structs because the URL field is specific to recorders,
but we want a more generic struct for use in the ProxyGroup status field.
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Updates tailscale/tailscale#1634
Logs from some iOS users indicate that we're pointlessly performing captive portal detection on certain interfaces named ipsec*. These are tunnels with the cellular carrier that do not offer Internet access, and are only used to provide internet calling functionality (VoLTE / VoWiFi).
```
attempting to do captive portal detection on interface ipsec1
attempting to do captive portal detection on interface ipsec6
```
This PR excludes interfaces with the `ipsec` prefix from captive portal detection.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Add logic for parsing and matching against our planned format for
AcceptEnv values. Namely, this supports direct matches against string
values and matching where * and ? are treated as wildcard characters
which match against an arbitrary number of characters and a single
character respectively.
Actually using this logic in non-test code will come in subsequent
changes.
Updates https://github.com/tailscale/corp/issues/22775
Signed-off-by: Mario Minardi <mario@tailscale.com>
Like Linux, macOS will reply to sendto(2) with EPERM if the firewall is
currently blocking writes, though this behavior is like Linux
undocumented. This is often caused by a faulting network extension or
content filter from EDR software.
Updates #11710
Updates #12891
Updates #13511
Signed-off-by: James Tucker <james@tailscale.com>
This breaks its ability to be used as an expvar and is blocking a trunkd
deploy. Revert for now, and add a test to ensure that we don't break it
in a future change.
Updates #13550
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1f1221c257c1de47b4bff0597c12f8530736116d
When querying for an exit node suggestion, occasionally it triggers a
new report concurrently with an existing report in progress. Generally,
there should always be a recent report or one in progress, so it is
redundant to start one there, and it causes concurrency issues.
Fixes#12643
Change-Id: I66ab9003972f673e5d4416f40eccd7c6676272a5
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
this commit changes usermetrics to be non-global, this is a building
block for correct metrics if a go process runs multiple tsnets or
in tests.
Updates #13420
Updates tailscale/corp#22075
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
So it doesn't delete and re-pull when switching between branches.
Updates tailscale/corp#17686
Change-Id: Iffb989781db42fcd673c5f03dbd0ce95972ede0f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates tailscale/tailscale#13326
Adds a CLI subcommand to perform DNS queries using the internal DNS forwarder and observe its internals (namely, which upstream resolvers are being used).
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Pin re-actors/alls-green usage to latest 1.x. This was previously
pointing to `@release/v2` which pulls in the latest changes from this
branch as they are released, with the potential to break our workflows
if a breaking change or malicious version on this stream is ever pushed.
Changing this to a pinned version also means that dependabot will keep
this in the pinned version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Update and pin actions/upload-artifact usage to latest 4.x. These were
previously pointing to @3 which pulls in the latest v3 as they are
released, with the potential to break our workflows if a breaking change
or malicious version on the @3 stream is ever pushed.
Changing this to a pinned version also means that dependabot will keep
this in the pinned version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Update and pin actions/cache usage to latest 4.x. These were previously
pointing to `@3` which pulls in the latest v3 as they are released, with
the potential to break our workflows if a breaking change or malicious
version on the `@3` stream is ever pushed.
Changing this to a pinned version also means that dependabot will keep
this in the pinned version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.
The breaking change between v3 and v4 is that v4 requires Node 20 which
should be a non-issue where this is run.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Use slackapi/slack-github-action across the board and pin to latest 1.x.
Previously we were referencing the 1.27.0 tag directly which is
vulnerable to someone replacing that version tag with malicious code.
Replace usage of ruby/action-slack with slackapi/slack-github-action as
the latter is the officially supported action from slack.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Pin codeql actions usage to latest 3.x. These were previously pointing
to `@2` which pulls in the latest v2 as they are released, with the
potential to break our workflows if a breaking change or malicious
version on the `@2` stream is ever pushed.
Changing this to a pinned version also means that dependabot will keep
this in the pinend version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.
The breaking change between v2 and v3 is that v3 requires Node 20 which
is a non-issue as we are running this on ubuntu latest.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Pin actions/checkout usage to latest 5.x. These were previously pointing
to `@4` which pulls in the latest v4 as they are released, with the
potential to break our workflows if a breaking change or malicious
version on the `@4` stream is ever pushed.
Changing this to a pinned version also means that dependabot will keep
this in the pinend version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.
The breaking change between v4 and v5 is that v5 requires Node 20 which
should be a non-issue where it is used.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Pin actions/checkout usage to latest 3.x or 4.x as appropriate. These
were previously pointing to `@4` or `@3` which pull in the latest
versions at these tags as they are released, with the potential to break
our workflows if a breaking change or malicious version for either of
these streams are released.
Changing this to a pinned version also means that dependabot will keep
this in the pinend version format (e.g., referencing a SHA) when it
opens a PR to bump the dependency.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Add an `AcceptEnv` field to `SSHRule`. This will contain the collection
of environment variable names / patterns that are specified in the
`acceptEnv` block for the SSH rule within the policy file. This will be
used in the tailscale client to filter out unacceptable environment
variables.
Updates: https://github.com/tailscale/corp/issues/22775
Signed-off-by: Mario Minardi <mario@tailscale.com>
Update go.toolchain.rev for https://github.com/tailscale/go/pull/104 and
add a test that, when using the tailscale_go build tag, we use the
right Go toolchain.
We'll crank up the strictness in later commits.
Updates #13527
Change-Id: Ifb09a844858be2beb144a420e4e9dbdc5c03ae3a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
containerboot's main.go had grown to well over 1000 lines with
lots of disparate bits of functionality. This commit is pure copy-
paste to group related functionality outside of the main function
into its own set of files. Everything is still in the main package
to keep the diff incremental and reviewable.
Updates #cleanup
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
mdnsResponder at least as of macOS Sequoia does not find NXDOMAIN
responses to these dns-sd PTR queries acceptable unless they include the
question section in the response. This was found debugging #13511, once
we turned on additional diagnostic reporting from mdnsResponder we
witnessed:
```
Received unacceptable 12-byte response from 100.100.100.100 over UDP via utun6/27 -- id: 0x7F41 (32577), flags: 0x8183 (R/Query, RD, RA, NXDomain), counts: 0/0/0/0,
```
If the response includes a question section, the resposnes are
acceptable, e.g.:
```
Received acceptable 59-byte response from 8.8.8.8 over UDP via en0/17 -- id: 0x2E55 (11861), flags: 0x8183 (R/Query, RD, RA, NXDomain), counts: 1/0/0/0,
```
This may be contributing to an issue under diagnosis in #13511 wherein
some combination of conditions results in mdnsResponder no longer
answering DNS queries correctly to applications on the system for
extended periods of time (multiple minutes), while dig against quad-100
provides correct responses for those same domains. If additional debug
logging is enabled in mdnsResponder we see it reporting:
```
Penalizing server 100.100.100.100 for 60 seconds
```
It is also possible that the reason that macOS & iOS never "stopped
spamming" these queries is that they have never been replied to with
acceptable responses. It is not clear if this special case handling of
dns-sd PTR queries was ever beneficial, and given this evidence may have
always been harmful. If we subsequently observe that the queries settle
down now that they have acceptable responses, we should remove these
special cases - making upstream queries very occasionally isn't a lot of
battery, so we should be better off having to maintain less special
cases and avoid bugs of this class.
Updates #2442
Updates #3025
Updates #3363
Updates #3594
Updates #13511
Signed-off-by: James Tucker <james@tailscale.com>
Updates tailscale/tailscale#13452
Bump the Go toolchain to the latest to pick up changes required to not crash on Android 9/10.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
In prep for upcoming flow tracking & mutex contention optimization
changes, this change refactors (subjectively simplifying) how the DERP
Server accounts for which peers have written to which other peers, to
be able to send PeerGoneReasonDisconnected messages to writes to
uncache their DRPO (DERP Return Path Optimization) routes.
Notably, this removes the Server.sentTo field which was guarded by
Server.mu and checked on all packet sends. Instead, the accounting is
moved to each sclient's sendLoop goroutine and now only needs to
acquire Server.mu for newly seen senders, the first time a peer sends
a packet to that sclient.
This change reduces the number of reasons to acquire Server.mu
per-packet from two to one. Removing the last one is the subject of an
upcoming change.
Updates #3560
Updates #150
Change-Id: Id226216d6629d61254b6bfd532887534ac38586c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This un-breaks vim-go (which doesn't understand "go 1.23") and allows
the natlab tests to work in a Nix shell (by adding the "qemu-img" and
"mkfs.ext4" binaries to the shell). These binaries are available even on
macOS, as I'm testing on my M1 Max.
Updates #13038
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I99f8521b5de93ea47dc33b099d5b243ffc1303da
Now that we have our API docs hosted at https://tailscale.com/api we can
remove the previous (and now outdated) markdown based docs. The top
level api.md has been left with the only content being the redirect to
the new docs.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
netcheck.Client.GetReport() applies its own deadlines. This 2s deadline
was causing GetReport() to never fall back to HTTPS/ICMP measurements
as it was shorter than netcheck.stunProbeTimeout, leaving no time
for fallbacks.
Updates #13394
Updates #6187
Signed-off-by: Jordan Whited <jordan@tailscale.com>
And update a few callers as examples of motivation. (there are a
couple others, but these are the ones where it's prettier)
Updates #cleanup
Change-Id: Ic8c5cb7af0a59c6e790a599136b591ebe16d38eb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
73280595a8 for #2751 added a "clientSet" interface to
distinguish the two cases of a client being singly connected (the
common case) vs tolerating multiple connections from the client at
once. At the time (three years ago) it was kinda an experiment
and we didn't know whether it'd stop the reconnect floods we saw
from certain clients. It did.
So this promotes it to a be first-class thing a bit, removing the
interface. The old tests from 73280595a were invaluable in ensuring
correctness while writing this change (they failed a bunch).
But the real motivation for this change is that it'll permit a future
optimization to add flow tracking for stats & performance where we
don't contend on Server.mu for each packet sent via DERP. Instead,
each client can track its active flows and hold on to a *clientSet and
ask the clientSet per packet what the active client is via one atomic
load rather than a mutex. And if the atomic load returns nil, we'll
know we need to ask the server to see if they died and reconnected and
got a new clientSet. But that's all coming later.
Updates #3560
Change-Id: I9ccda3e5381226563b5ec171ceeacf5c210e1faf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When the desired netfilter mode was unset, we would always try
to use the `iptables` binary. In such cases if iptables was not found,
tailscaled would just crash as seen in #13440. To work around this, in those
cases check if the `iptables` binary even exists and if it doesn't fall back
to the nftables implementation.
Verified that it works on stock Ubuntu 24.04.
Updates #5621
Updates #8555
Updates #8762Fixes#13440
Signed-off-by: Maisem Ali <maisem@tailscale.com>
cmd/k8s-operator,k8s-operator,kube: Add TSRecorder CRD + controller
Deploys tsrecorder images to the operator's cluster. S3 storage is
configured via environment variables from a k8s Secret. Currently
only supports a single tsrecorder replica, but I've tried to take early
steps towards supporting multiple replicas by e.g. having a separate
secret for auth and state storage.
Example CR:
```yaml
apiVersion: tailscale.com/v1alpha1
kind: Recorder
metadata:
name: rec
spec:
enableUI: true
```
Updates #13298
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This mimics having Tailscale in the 'Stopped' state by programming an
empty DNS configuration when the current node key is expired.
Updates tailscale/support-escalations#55
Change-Id: I68ff4665761fb621ed57ebf879263c2f4b911610
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
It was scaring people. It's been pretty stable for quite some time now
and we're unlikely to change the API and break people at this point.
We might, but have been trying not to.
Fixestailscale/corp#22933
Change-Id: I0c3c79b57ccac979693c62ba320643a940ac947e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Rename kube/{types,client,api} -> kube/{kubetypes,kubeclient,kubeapi}
so that we don't need to rename the package on each import to
convey that it's kubernetes specific.
Updates#cleanup
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Further split kube package into kube/{client,api,types}. This is so that
consumers who only need constants/static types don't have to import
the client and api bits.
Updates#cleanup
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We already disable dynamic updates by setting DisableDynamicUpdate to 1 for the Tailscale interface.
However, this does not prevent non-dynamic DNS registration from happening when `ipconfig /registerdns`
runs and in similar scenarios. Notably, dns/windowsManager.SetDNS runs `ipconfig /registerdns`,
triggering DNS registration for all interfaces that do not explicitly disable it.
In this PR, we update dns/windowsManager.disableDynamicUpdates to also set RegistrationEnabled to 0.
Fixes#13411
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We no longer need this on Windows, and it was never required on other platforms.
It just results in more short-lived connections unless we use HTTP/2.
Updates tailscale/corp#18342
Signed-off-by: Nick Khyl <nickk@tailscale.com>
When tailscaled restarts and our watch connection goes down, we get
stuck in an infinite loop printing `ipnbus error: EOF` (which ended up
consuming all the disk space on my laptop via the log file). Instead,
handle errors in `watchIPNBus` and reconnect after a short delay.
Updates #1708
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Disable TCP & UDP GRO if the probe fails.
torvalds/linux@e269d79c7d broke virtio_net
TCP & UDP GRO causing GRO writes to return EINVAL. The bug was then
resolved later in
torvalds/linux@89add40066. The offending
commit was pulled into various LTS releases.
Updates #13041
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Discovered this while investigating the following issue; I think it's
unrelated, but might as well fix it. Also, add a test helper for
checking things that have an IsZero method using the reflect package.
Updates tailscale/support-escalations#55
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I57b7adde43bcef9483763b561da173b4c35f49e2
When a rotation signature chain reaches a certain size, remove the
oldest rotation signature from the chain before wrapping it in a new
rotation signature.
Since all previous rotation signatures are signed by the same wrapping
pubkey (node's own tailnet lock key), the node can re-construct the
chain, re-signing previous rotation signatures. This will satisfy the
existing certificate validation logic.
Updates #13185
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
With the upcoming syspolicy changes, it's imperative that all syspolicy keys are defined in the syspolicy package
for proper registration. Otherwise, the corresponding policy settings will not be read.
This updates a couple of places where we still use string literals rather than syspolicy consts.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates tailscale/tailscale#13326
This PR begins implementing a `tailscale dns` command group in the Tailscale CLI. It provides an initial implementation of `tailscale dns status` which dumps the state of the internal DNS forwarder.
Two new endpoints were added in LocalAPI to support the CLI functionality:
- `/netmap`: dumps a copy of the last received network map (because the CLI shouldn't have to listen to the ipn bus for a copy)
- `/dns-osconfig`: dumps the OS DNS configuration (this will be very handy for the UI clients as well, as they currently do not display this information)
My plan is to implement other subcommands mentioned in tailscale/tailscale#13326, such as `query`, in later PRs.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
This PR changes how LocalBackend handles interactive (initiated via StartLoginInteractive) and non-interactive (e.g., due to key expiration) logins,
and when it sends the authURL to the connected clients.
Specifically,
- When a user initiates an interactive login by clicking Log In in the GUI, the LocalAPI calls StartLoginInteractive.
If an authURL is available and hasn't expired, we immediately send it to all connected clients, suggesting them to open that URL in a browser.
Otherwise, we send a login request to the control plane and set a flag indicating that an interactive login is in progress.
- When LocalBackend receives an authURL from the control plane, we check if it differs from the previous one and whether an interactive login
is in progress. If either condition is true, we notify all connected clients with the new authURL and reset the interactive login flag.
We reset the auth URL and flags upon a successful authentication, when a different user logs in and when switching Tailscale login profiles.
Finally, we remove the redundant dedup logic added to WatchNotifications in #12096 and revert the tests to their original state to ensure that
calling StartLoginInteractive always produces BrowseToURL notifications, either immediately or when the authURL is received from the control plane.
Fixes#13296
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates tailscale/tailscale#177
It appears that the OSS distribution of `tailscaled` is currently unable to get the current system base DNS configuration, as GetBaseConfig() in manager_darwin.go is unimplemented. This PR adds a basic implementation that reads the current values in `/etc/resolv.conf`, to at least unblock DNS resolution via Quad100 if `--accept-dns` is enabled.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
We've added more probe targets recently which has resulted in more
timeouts behind restrictive NATs in localized testing that don't
like how many flows we are creating at once. Not so much an issue
for datacenter or cloud-hosted deployments.
Updates tailscale/corp#22114
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We add package defining interfaces for policy stores, enabling creation of policy sources
and reading settings from them. It includes a Windows-specific PlatformPolicyStore for GP and MDM
policies stored in the Registry, and an in-memory TestStore for testing purposes.
We also include an internal package that tracks and reports policy usage metrics when a policy setting
is read from a store. Initially, it will be used only on Windows and Android, as macOS, iOS, and tvOS
report their own metrics. However, we plan to use it across all platforms eventually.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
* cmd/k8s-operator,k8s-operator/sessonrecording: ensure CastHeader contains terminal size
For tsrecorder to be able to play session recordings, the recording's
CastHeader must have '.Width' and '.Height' fields set to non-zero.
Kubectl (or whoever is the client that initiates the 'kubectl exec'
session recording) sends the terminal dimensions in a resize message that
the API server proxy can intercept, however that races with the first server
message that we need to record.
This PR ensures we wait for the terminal dimensions to be processed from
the first resize message before any other data is sent, so that for all
sessions with terminal attached, the header of the session recording
contains the terminal dimensions and the recording can be played by tsrecorder.
Updates tailscale/tailscale#19821
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Previously, despite what the commit said, we were using a raw IP socket
that was *not* an AF_PACKET socket, and thus was subject to the host
firewall rules. Switch to using a real AF_PACKET socket to actually get
the functionality we want.
Updates #13140
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: If657daeeda9ab8d967e75a4f049c66e2bca54b78
I should've bumped capver in 65fe0ba7b5 but forgot.
This lets us turn off the cryptokey routing change from control for
the affected panicky range of commits, based on capver.
Updates #13332
Updates tailscale/corp#20732
Change-Id: I32c17cfcb45b2369b2b560032330551d47a0ce0b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Both for Raspberry Pis, and for running natlab tests faster on Apple
Silicon Macs without emulating x86.
Not fully wired up yet.
Updates #1866
Updates #13038
Change-Id: I1552bf107069308f325f640773cc881ed735b5ab
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
No need to make callers specify the redundant IP version or
TTL/HopLimit or EthernetType in the common case. The mkPacket helper
can set those when unset.
And use the mkIPLayer in another place, simplifying some code.
And rename mkPacketErr to just mkPacket, then move mkPacket to
test-only code, as mustPacket.
Updates #13038
Change-Id: Ic216e44dda760c69ab9bfc509370040874a47d30
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And clean up some of the test helpers in the process.
Updates #13038
Change-Id: I3e2b5f7028a32d97af7f91941e59399a8e222b25
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I'd added this helper for tests, but then moved it to non-test code
and forgot some places to use it. This uses it in more places to
remove some boilerplate.
Updates #13038
Change-Id: Ic4dc339be1c47a55b71d806bab421097ee3d75ed
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Logging serial numbers every time they are read might have been useful
early on, but seems unnecessary now.
Updates #5902
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
The LocalBackend's state machine starts in NoState and soon transitions to NeedsLogin if there's no auto-start profile,
with the profileManager starting with a new empty profile. Notably, entering the NeedsLogin state blocks engine updates.
We expect the user to transition out of this state by logging in interactively, and we set WantRunning to true when
controlclient enters the StateAuthenticated state.
While our intention is correct, and completing an interactive login should set WantRunning to true, our assumption
that logging into the current Tailscale profile is the only way to transition out of the NeedsLogin state is not accurate.
Another common transition path includes an explicit profile switch (via LocalBackend.SwitchProfile) or an implicit switch
when a Windows user connects to the backend. This results in a bug where WantRunning is set to true even when it was
previously set to false, and the user expressed no intention of changing it.
A similar issue occurs when switching from (sic) a Tailnet that has seamlessRenewalEnabled, regardless of the current state
of the LocalBackend's state machine, and also results in unexpectedly set WantRunning. While this behavior is generally
undesired, it is also incorrect that it depends on the control knobs of the Tailnet we're switching from rather than
the Tailnet we're switching to. However, this issue needs to be addressed separately.
This PR updates LocalBackend.SetControlClientStatus to only set WantRunning to true in response to an interactive login
as indicated by a non-empty authURL.
Fixes#6668Fixes#11280
Updates #12756
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This adds tests for DNS requests, and ignoring IPv6 packets on v4-only
networks.
No behavior changes. But some things are pulled out into functions.
And the mkPacket helpers previously just for tests are moved into
non-test code to be used elsewhere to reduce duplication, doing the
checksum stuff automatically.
Updates #13038
Change-Id: I4dd0b73c75b2b9567b4be3f05a2792999d83f6a3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When the TS_DEBUG_NETSTACK_LOOPBACK_PORT environment variable is set,
netstack will loop back (dnat to addressFamilyLoopback:loopbackPort)
TCP & UDP flows originally destined to localServicesIP:loopbackPort.
localServicesIP is quad-100 or the IPv6 equivalent.
Updates tailscale/corp#22713
Signed-off-by: Jordan Whited <jordan@tailscale.com>
In preparation for multi-user and unattended mode improvements, we are
refactoring and cleaning up `ipn/ipnlocal.profileManager`. The concept of the
"current user", which is only relevant on Windows, is being deprecated and will
soon be removed to allow more than one Windows user to connect and utilize
`LocalBackend` according to that user's access rights to the device and specific
Tailscale profiles.
We plan to pass the user's identity down to the `profileManager`, where it can
be used to determine the user's access rights to a given `LoginProfile`. While
the new permission model in `ipnauth` requires more work and is currently
blocked pending PR reviews, we are updating the `profileManager` to reduce its
reliance on the concept of a single OS user being connected to the backend at
the same time.
We extract the switching to the default Tailscale profile, which may also
trigger legacy profile migration, from `profileManager.SetCurrentUserID`. This
introduces `profileManager.DefaultUserProfileID`, which returns the default
profile ID for the current user, and `profileManager.SwitchToDefaultProfile`,
which is essentially a shorthand for `pm.SwitchProfile(pm.DefaultUserProfileID())`.
Both methods will eventually be updated to accept the user's identity and
utilize that user's default profile.
We make access checks more explicit by introducing the `profileManager.checkProfileAccess`
method. The current implementation continues to use `profileManager.currentUserID`
and `LoginProfile.LocalUserID` to determine whether access to a given profile
should be granted. This will be updated to utilize the `ipnauth` package and the
new permissions model once it's ready. We also expand access checks to be used
more widely in the `profileManager`, not just when switching or listing
profiles. This includes access checks in methods like `SetPrefs` and, most notably,
`DeleteProfile` and `DeleteAllProfiles`, preventing unprivileged Windows users
from deleting Tailscale profiles owned by other users on the same device,
including profiles owned by local admins.
We extract `profileManager.ProfilePrefs` and `profileManager.SetProfilePrefs`
methods that can be used to get and set preferences of a given `LoginProfile` if
`profileManager.checkProfileAccess` permits access to it.
We also update `profileManager.setUnattendedModeAsConfigured` to always enable
unattended mode on Windows if `Prefs.ForceDaemon` is true in the current
`LoginProfile`, even if `profileManager.currentUserID` is `""`. This facilitates
enabling unattended mode via `tailscale up --unattended` even if
`tailscale-ipn.exe` is not running, such as when a Group Policy or MDM-deployed
script runs at boot time, or when Tailscale is used on a Server Code or otherwise
headless Windows environments. See #12239, #2137, #3186 and
https://github.com/tailscale/tailscale/pull/6255#issuecomment-2016623838 for
details.
Fixes#12239
Updates tailscale/corp#18342
Updates #3186
Updates #2137
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This adds support for sending packets to 33:33:00:00:01 at IPv6
multicast address ff02::1 to send to all nodes.
Nothing in Tailscale depends on this (yet?), but it makes debugging in
VMs behind natlab easier (e.g. you can ping all nodes), and other
things might depend on this in the future.
Mostly I'm trying to flesh out the IPv6 support in natlab now that we
can write vnet tests.
Updates #13038
Change-Id: If590031fcf075690ca35c7b230a38c3e72e621eb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Currently, we use PermitRead/PermitWrite/PermitCert permission flags to determine which operations are allowed for a LocalAPI client.
These checks are performed when localapi.Handler handles a request. Additionally, certain operations (e.g., changing the serve config)
requires the connected user to be a local admin. This approach is inherently racey and is subject to TOCTOU issues.
We consider it to be more critical on Windows environments, which are inherently multi-user, and therefore we prevent more than one
OS user from connecting and utilizing the LocalBackend at the same time. However, the same type of issues is also applicable to other
platforms when switching between profiles that have different OperatorUser values in ipn.Prefs.
We'd like to allow more than one Windows user to connect, but limit what they can see and do based on their access rights on the device
(e.g., an local admin or not) and to the currently active LoginProfile (e.g., owner/operator or not), while preventing TOCTOU issues on Windows
and other platforms. Therefore, we'd like to pass an actor from the LocalAPI to the LocalBackend to represent the user performing the operation.
The LocalBackend, or the profileManager down the line, will then check the actor's access rights to perform a given operation on the device
and against the current (and/or the target) profile.
This PR does not change the current permission model in any way, but it introduces the concept of an actor and includes some preparatory
work to pass it around. Temporarily, the ipnauth.Actor interface has methods like IsLocalSystem and IsLocalAdmin, which are only relevant
to the current permission model. It also lacks methods that will actually be used in the new model. We'll be adding these gradually in the next
PRs and removing the deprecated methods and the Permit* flags at the end of the transition.
Updates tailscale/corp#18342
Signed-off-by: Nick Khyl <nickk@tailscale.com>
To test how virtual machines connect to the natlab vnet code.
Updates #13038
Change-Id: Ia4fd4b0c1803580ee7d94cc9878d777ad4f24f82
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And refactor some of vnet.go for testability.
The only behavioral change (with a new test) is that ethernet
broadcasts no longer get sent back to the sender.
Updates #13038
Change-Id: Ic2e7e7d6d8805b7b7f2b5c52c2c5ba97101cef14
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The bad naming (which had only been half updated with the IPv6
changes) tripped me up in the earlier change.
Updates #13038
Change-Id: I65ce07c167e8219d35b87e1f4bf61aab4cac31ff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The reason they weren't working was because the cmd/tta agent in the
guest was dialing out to the test and the vnet couldn't map its global
unicast IPv6 address to a node as it was just using a
map[netip.Addr]*node and blindly trusting the *node was
populated. Instead, it was nil, so the agent connection fetching
didn't work for its RoundTripper and the test could never drive the
node. That map worked for IPv4 but for IPv6 we need to use the method
that takes into account the node's IPv6 SLAAC address. Most call sites
had been converted but I'd missed that one.
Also clean up some debug, and prohibit nodes' link-local unicast
addresses from dialing 2000::/3 directly for now. We can allow that to
be configured opt-in later (some sort of IPv6 NAT mode. Whatever it's
called.) That mode was working on accident, but was confusing: Linux
would do source address selection from link local for the first few
seconds and then after SLAAC and DAD, switch to using the global
unicast source address. Be consistent for now and force it to use the
global unicast.
Updates #13038
Change-Id: I85e973aaa38b43c14611943ff45c7c825ee9200a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Really we need to fix logpolicy + bootstrapDNS to not be so aggressive,
but this is a quick workaround meanwhile.
Without this, tailscaled starts immediately while IPv6 DAD is
happening for a couple seconds and logpolicy freaks out without the
network available and starts spamming stderr about bootstrap DNS
options. But we see that regularly anyway from people whose wifi is
down. So we need to fix the general case. This is not that fix.
Updates #13038
Change-Id: Iba7e536d08e59d34abded1d279f88fdc9c46d94d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There were a few places it could get wedged (notably the dial without
a timeout).
And add a knob for verbose debug logs.
And keep two idle connections always.
Updates #13038
Change-Id: I952ad182d7111481d97a83c12aa2ff4bfdc55fe8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Move all the UDP handling to its own func to remove a bunch of "if
isUDP" checks in a bunch of blocks.
Updates #13038
Change-Id: If71d71b49e57651d15bd307a2233c43751cc8639
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I didn't actually see this, but added this while debugging something
and figured it'd be good to keep.
Updates #13038
Change-Id: I67934c8a329e0233f79c3b08516fd6bad6bfe22a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
On a major link change the LAN routes may change, so on linkChange where
ChangeDelta.Major, we need to call authReconfig to ensure that new
routes are observed and applied.
Updates tailscale/corp#22574
Signed-off-by: James Tucker <james@tailscale.com>
This is the equivalent of quad-100, but for IPv6. This is technically
already contained in the Tailscale IPv6 ULA prefix, but that is only
installed when remote peers are visible via control with contained
addrs. The service addr should always be reachable.
Updates #1152
Signed-off-by: Jordan Whited <jordan@tailscale.com>
And sprinkle some more docs around.
Updates #13038
Change-Id: Ia2dcf567b68170481cc2094d64b085c6b94a778a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We have several checked type assertions to *types.Named in both cmd/cloner and cmd/viewer.
As Go 1.23 updates the go/types package to produce Alias type nodes for type aliases,
these type assertions no longer work as expected unless the new behavior is disabled
with gotypesalias=0.
In this PR, we add codegen.NamedTypeOf(t types.Type), which functions like t.(*types.Named)
but also unrolls type aliases. We then use it in place of type assertions in the cmd/cloner and
cmd/viewer packages where appropriate.
We also update type switches to include *types.Alias alongside *types.Named in relevant cases,
remove *types.Struct cases when switching on types.Type.Underlying and update the tests
with more cases where type aliases can be used.
Updates #13224
Updates #12912
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Go 1.23 updates the go/types package to produce Alias type nodes for type aliases, unless disabled with gotypesalias=0.
This new default behavior breaks codegen.LookupMethod, which uses checked type assertions to types.Named and
types.Interface, as only named types and interfaces have methods.
In this PR, we update codegen.LookupMethod to perform method lookup on the right-hand side of the alias declaration
and clearly switch on the supported type nodes types. We also improve support for various edge cases, such as when an alias
is used as a type parameter constraint, and add tests for the LookupMethod function.
Additionally, we update cmd/viewer/tests to include types with aliases used in type fields and generic type constraints.
Updates #13224
Updates #12912
Signed-off-by: Nick Khyl <nickk@tailscale.com>
All the magic service names with virtual IPs will need IPv6 variants.
Pull this out in prep.
Updates #13038
Change-Id: I53b5eebd0679f9fa43dc0674805049258c83a0de
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So we don't log about them when verbose logging is enabled.
Updates #13038
Change-Id: I925bc3a23e6c93d60dd4fb4bf6a4fdc5a326de95
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The natlab Test Agent (tta) still had its old log streaming hack in
place where it dialed out to anything on TCP port 124 and those logs
were streamed to the host running the tests. But we'd since added gokrazy
syslog streaming support, which made that redundant.
So remove all the port 124 stuff. And then make sure we log to stderr
so gokrazy logs it to syslog.
Also, keep the first 1MB of logs in memory in tta too, exported via
localhost:8034/logs for interactive debugging. That was very useful
during debugging when I added IPv6 support. (which is coming in future
PRs)
Updates #13038
Change-Id: Ieed904a704410b9031d5fd5f014a73412348fa7f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Otherwise you get "Access denied: watch IPN bus access denied, must
set ipn.NotifyNoPrivateKeys when not running as admin/root or
operator".
This lets a non-operator at least start the app and see the status, even
if they can't change everything. (the web UI is unaffected by operator)
A future change can add a LocalAPI call to check permissions and guide
people through adding a user as an operator (perhaps the web client
can do that?)
Updates #1708
Change-Id: I699e035a251b4ebe14385102d5e7a2993424c4b7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This adds a systray app for linux, similar to the apps for macOS and
windows. There are already a number of community-developed systray apps,
but most of them are either long abandoned, are built for a specific
desktop environment, or simply wrap the tailscale CLI.
This uses fyne.io/systray (a fork of github.com/getlantern/systray)
which uses newer D-Bus specifications to render the tray icon and menu.
This results in a pretty broad support for modern desktop environments.
This initial commit lacks a number of features like profile switching,
device listing, and exit node selection. This is really focused on the
application structure, the interaction with LocalAPI, and some system
integration pieces like the app icon, notifications, and the clipboard.
Updates #1708
Signed-off-by: Will Norris <will@tailscale.com>
updates tailcale/corp#22371
For dgram mode, we need to store the write addresses of
the client socket(s) alongside the writer functions and
the write operation needs to use WriteToUnix.
Unix also has multiple clients writing to the same socket,
so the serve method is modified to handle packets from
multiple mac addresses.
Cleans up a bit of cruft from the initial tailmac tooling
commit.
Now all the macOS packets are belong to us.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
And convert a few callers as an example, but nowhere near all.
Updates #12912
Change-Id: I5eaa12a29a6cd03b58d6f1072bd27bc0467852f2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In prep for updating to new staticcheck required for Go 1.23.
Updates #12912
Change-Id: If77892a023b79c6fa798f936fc80428fd4ce0673
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
To avoid dig vs nslookup vs $X availability issues between
OSes/distros. And to be in Go, to match the resolver we use.
Updates #13038
Change-Id: Ib7e5c351ed36b5470a42cbc230b8f27eed9a1bf8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
net/tstun.Wrapper.InjectInboundPacketBuffer is not GSO-aware, which can
break quad-100 TCP streams as a result. Linux is the only platform where
gVisor GSO was previously enabled.
Updates tailscale/corp#22511
Updates #13211
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Instead of changing the working directory before launching the incubator process,
this now just changes the working directory after dropping privileges, at which
point we're more likely to be able to enter the user's home directory since we're
running as the user.
For paths that use the 'login' or 'su -l' commands, those already take care of changing
the working directory to the user's home directory.
Fixes#13120
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This adds a new package containing generic types to be used for defining preference hierarchies.
These include prefs.Item, prefs.List, prefs.StructList, and prefs.StructMap. Each of these types
represents a configurable preference, holding the preference's state, value, and metadata.
The metadata includes the default value (if it differs from the zero value of the Go type)
and flags indicating whether a preference is managed via syspolicy or is hidden/read-only for
another reason. This information can be marshaled and sent to the GUI, CLI and web clients
as a source of truth regarding preference configuration, management, and visibility/mutability states.
We plan to use these types to define device preferences, such as the updater preferences,
the permission mode to be used on Windows with #tailscale/corp#18342, and certain global options
that are currently exposed as tailscaled flags. We also aim to eventually use these types for
profile-local preferences in ipn.Prefs and and as a replacement for ipn.MaskedPrefs.
The generic preference types are compatible with the tailscale.com/cmd/viewer and
tailscale.com/cmd/cloner utilities.
Updates #12736
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In Tailnet Lock, there is an implicit limit on the number of rotation
signatures that can be chained before the signature becomes too long.
This program helps tailnet admins to identify nodes that have signatures
with long chains and prints commands to re-sign those node keys with a
fresh direct signature. It's a temporary mitigation measure, and we will
remove this tool as we design and implement a long-term approach for
rotation signatures.
Example output:
```
2024/08/20 18:25:03 Self: does not need re-signing
2024/08/20 18:25:03 Visible peers with valid signatures:
2024/08/20 18:25:03 Peer xxx2.yy.ts.net. (100.77.192.34) nodeid=nyDmhiZiGA11KTM59, current signature kind=direct: does not need re-signing
2024/08/20 18:25:03 Peer xxx3.yy.ts.net. (100.84.248.22) nodeid=ndQ64mDnaB11KTM59, current signature kind=direct: does not need re-signing
2024/08/20 18:25:03 Peer xxx4.yy.ts.net. (100.85.253.53) nodeid=nmZfVygzkB21KTM59, current signature kind=rotation: chain length 4, printing command to re-sign
tailscale lock sign nodekey:530bddbfbe69e91fe15758a1d6ead5337aa6307e55ac92dafad3794f8b3fc661 tlpub:4bf07597336703395f2149dce88e7c50dd8694ab5bbde3d7c2a1c7b3e231a3c2
```
To support this, the NetworkLockStatus localapi response now includes
information about signatures of all peers rather than just the invalid
ones. This is not displayed by default in `tailscale lock status`, but
will be surfaced in `tailscale lock status --json`.
Updates #13185
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
This involved the following:
1. Pass the su command path as first of args in call to unix.Exec to make sure that busybox sees the correct program name.
Busybox is a single executable userspace that implements various core userspace commands in a single binary. You'll
see it used via symlinking, so that for example /bin/su symlinks to /bin/busybox. Busybox knows that you're trying
to execute /bin/su because argv[0] is '/bin/su'. When we called unix.Exec, we weren't including the program name for
argv[0], which caused busybox to fail with 'applet not found', meaning that it didn't know which command it was
supposed to run.
2. Tell su to whitelist the SSH_AUTH_SOCK environment variable in order to support ssh agent forwarding.
3. Run integration tests on alpine, which uses busybox.
4. Increment CurrentCapabilityVersion to allow turning on SSH V2 behavior from control.
Fixes#12849
Signed-off-by: Percy Wegmann <percy@tailscale.com>
In df6014f1d7 we removed build tag
gating preventing importation, which tripped a NetworkExtension limit
test in corp. This was a reversal of
25f0a3fc8f which actually made the
situation worse, hence the simplification.
This commit goes back to the strategy in
25f0a3fc8f, and gets us back under the
limit in my local testing. Admittedly, we don't fully understand
the effects of importing or excluding importation of this package,
and have seen mixed results, but this commit allows us to move forward
again.
Updates tailscale/corp#22125
Signed-off-by: Jordan Whited <jordan@tailscale.com>
In 2f27319baf we disabled GRO due to a
data race around concurrent calls to tstun.Wrapper.Write(). This commit
refactors GRO to be thread-safe, and re-enables it on Linux.
This refactor now carries a GRO type across tstun and netstack APIs
with a lifetime that is scoped to a single tstun.Wrapper.Write() call.
In 25f0a3fc8f we used build tags to
prevent importation of gVisor's GRO package on iOS as at the time we
believed it was contributing to additional memory usage on that
platform. It wasn't, so this commit simplifies and removes those
build tags.
Updates tailscale/corp#22353
Updates tailscale/corp#22125
Updates #6816
Signed-off-by: Jordan Whited <jordan@tailscale.com>
cmd/k8s-operator/deploy: replace wildcards in Kubernetes Operator RBAC role definitions with verbs
fixes: #13168
Signed-off-by: Pierig Le Saux <pierig@n3xt.io>
Updates tailscale/corp#22120
Adds the ability to start the backend by reading an authkey stored in the syspolicy database (MDM). This is useful for devices that are provisioned in an unattended fashion.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
updates tailcale/corp#22371
Adds custom macOS vm tooling. See the README for
the general gist, but this will spin up VMs with unixgram
capable network interfaces listening to a named socket,
and with a virtio socket device for host-guest communication.
We can add other devices like consoles, serial, etc as needed.
The whole things is buildable with a single make command, and
everything is controllable via the command line using the TailMac
utility.
This should all be generally functional but takes a few shortcuts
with error handling and the like. The virtio socket device support
has not been tested and may require some refinement.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
`DNS unavailable` was marked as a high severity warning. On Android (and other platforms), these trigger a system notification. Here we reduce the severity level to medium. A medium severity warning will still display the warning icon on platforms with a tray icon because of the `ImpactsConnectivity=true` flag being set here, but it won't show a notification anymore. If people enter an area with bad cellular reception, they're bound to receive so many of these notifications and we need to reduce notification fatigue.
Signed-off-by: Andrea Gottardo <andrea@tailscale.com>
In a93dc6cdb1 tryUpgradeToBatchingConn()
moved to build tag gated files, but the runtime.GOOS condition excluding
Android was removed unintentionally from batching_conn_linux.go. Add it
back.
Updates tailscale/corp#22348
Signed-off-by: Jordan Whited <jordan@tailscale.com>
By default, Windows sets the SIO_UDP_CONNRESET and SIO_UDP_NETRESET
options on created UDP sockets. These behaviours make the UDP socket
ICMP-aware; when the system gets an ICMP message (e.g. an "ICMP Port
Unreachable" message, in the case of SIO_UDP_CONNRESET), it will cause
the underlying UDP socket to throw an error. Confusingly, this can occur
even on reads, if the same UDP socket is used to write a packet that
triggers this response.
The Go runtime disabled the SIO_UDP_CONNRESET behavior in 3114bd6, but
did not change SIO_UDP_NETRESET–probably because that socket option
isn't documented particularly well.
Various other networking code seem to disable this behaviour, such as
the Godot game engine (godotengine/godot#22332) and the Eclipse TCF
agent (link below). Others appear to work around this by ignoring the
error returned (anacrolix/dht#16, among others).
For now, until it's clear whether this ends up in the upstream Go
implementation or not, let's also disable the SIO_UDP_NETRESET in a
similar manner to SIO_UDP_CONNRESET.
Eclipse TCF agent: https://gitlab.eclipse.org/eclipse/tcf/tcf.agent/-/blob/master/agent/tcf/framework/mdep.c
Updates #10976
Updates golang/go#68614
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I70a2f19855f8dec1bfb82e63f6d14fc4a22ed5c3
Coder has just adopted nhooyr/websocket which unfortunately changes the import path.
`github.com/coder/coder` imports `tailscale.com/net/wsconn` which was still pointing
to `nhooyr.io/websocket`, but this change updates it.
See https://coder.com/blog/websocket
Updates #13154
Change-Id: I3dec6512472b14eae337ae22c5bcc1e3758888d5
Signed-off-by: Kyle Carberry <kyle@carberry.com>
This PR modifies viewTypeForContainerType to use the last type parameter of a container type
as the value type, enabling the implementation of map-like container types where the second-to-last
(usually first) type parameter serves as the key type.
It also adds a MapContainer type to test the code generation.
Updates #12736
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This prevents two things:
1. Crashing if there's no response body
2. Sending a nonsensical 0 response status code
Updates tailscale/corp#22357
Signed-off-by: Percy Wegmann <percy@tailscale.com>
cmd/k8s-operator,k8s-operator/sessionrecording: support recording WebSocket sessions
Kubernetes currently supports two streaming protocols, SPDY and WebSockets.
WebSockets are replacing SPDY, see
https://github.com/kubernetes/enhancements/issues/4006.
We were currently only supporting SPDY, erroring out if session
was not SPDY and relying on the kube's built-in SPDY fallback.
This PR:
- adds support for parsing contents of 'kubectl exec' sessions streamed
over WebSockets
- adds logic to distinguish 'kubectl exec' requests for a SPDY/WebSockets
sessions and call the relevant handler
Updates tailscale/corp#19821
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
Add functionality to optionally serve a health check endpoint
(off by default).
Users can enable health check endpoint by setting
TS_HEALTHCHECK_ADDR_PORT to [<addr>]:<port>.
Containerboot will then serve an unauthenticatd HTTP health check at
/healthz at that address. The health check returns 200 OK if the
node has at least one tailnet IP address, else returns 503.
Updates tailscale/tailscale#12898
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This latest version allows for building on various OpenBSD architectures.
(such as openbsd/riscv64)
Updates #8043
Change-Id: Ie9a8738e6aa96335214d5750e090db35e526a4a4
Signed-off-by: Aaron Bieber <aaron@bolddaemon.com>
... rather than abusing the generic tsapp.
Per discussion in https://github.com/gokrazy/gokrazy/pull/275
It also means we can remove stuff we don't need, like ntp or randomd.
Updates #13038
Change-Id: Iccf579c354bd3b5025d05fa1128e32f1d5bde4e4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It's too new to be supported in Debian bookworm so just remove it.
It doesn't seem to matter or help speed anything up.
Updates #13038
Change-Id: I39077ba8032bebecd75209552b88f1842c843c33
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
84adfa1ba3 made MAC addresses 1-based too, but didn't adjust this IP address
calculation which was based on the MAC address
Updates #13038
Change-Id: Idc112b303b0b85f41fe51fd61ce1c0d8a3f0f57e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The heartbeat package does nothing if not configured anyway, so don't
even put it in the image and pay the cost of it running.
Updates #13038
Updates #1866
Change-Id: Id22c0fb1f8395ad21ab0e0350973d31730e8d39f
The change in b7e48058c8 was too loose; it also captured the CLI
being run as a child process under cmd/tta.
Updates #13038
Updates #1866
Change-Id: Id410b87132938dd38ed4dd3959473c5d0d242ff5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* cmd/k8s-operator: fix DNS reconciler for dual-stack clusters
This fixes a bug where DNS reconciler logic was always assuming
that no more than one EndpointSlice exists for a Service.
In fact, there can be multiple, for example, in dual-stack
clusters, but also in other cases this is valid (as per kube docs).
This PR:
- allows for multiple EndpointSlices
- picks out the ones for IPv4 family
- deduplicates addresses
Updates tailscale/tailscale#13056
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
These three packages aren't in gokrazy/tsapp/config.json but
used to be. Unfortunately, that meant that were being included
in the resulting image. Apparently `gok` doesn't delete them or
warn about them being present on disk when they're moved from
the config file.
Updates #13038
Updates #1866
Change-Id: I54918a9e3286ea755b11dde5e9efdd433b8f8fb8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We had a mix of 0-based and 1-based nodes and MACs in logs.
Updates #13038
Change-Id: I36d1b00f7f94b37b4ae2cd439bcdc5dbee6eda4d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Using https://github.com/gokrazy/gokrazy/pull/275
This is much lower latency than logcatcher, which is higher latency
and chunkier. And this is better than getting it via 'tailscale debug
daemon-logs', which misses early interesting logs.
Updates #13038
Change-Id: I499ec254c003a9494c0e9910f9c650c8ac44ef33
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Package setting contains types for defining and representing policy settings.
It facilitates the registration of setting definitions using Register and RegisterDefinition,
and the retrieval of registered setting definitions via Definitions and DefinitionOf.
This package is intended for use primarily within the syspolicy package hierarchy,
and added in a preparation for the next PRs.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In particular, tests showing that #3824 works. But that test doesn't
actually work yet; it only gets a DERP connection. (why?)
Updates #13038
Change-Id: Ie1fd1b6a38d4e90fae7e72a0b9a142a95f0b2e8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Similar to UseSocketOnly, but pulled out separately in case
people are doing unknown weird things.
Updates #13038
Change-Id: I7478e5cb9794439b947440b831caa798941845ea
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
getConns() is now responsible for returning both stable and unstable
conns. conn and measureFn are now passed together via connAndMeasureFn.
newConnAndMeasureFn() is responsible for constructing them.
TCP measurement timeouts are adjusted to more closely match netcheck.
Updates tailscale/corp#22114
Signed-off-by: Jordan Whited <jordan@tailscale.com>
To test local connections.
Updates #13038
Change-Id: I575dcab31ca812edf7d04fa126772611cf89b9a7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This adds a new NodeAgentClient type that can be used to
invoke the LocalAPI using the LocalClient instead of
handcrafted URLs. However, there are certain cases where
it does make sense for the node agent to provide more
functionality than whats possible with just the LocalClient,
as such it also exposes a http.Client to make requests directly.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
And don't make guests under vnet/natlab upload to logcatcher,
as there won't be a valid cert anyway.
Updates #13038
Change-Id: Ie1ce0139788036b8ecc1804549a9b5d326c5fef5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
'stun' has been removed from metric names and replaced with a protocol
label. This refactor is preparation work for HTTPS & ICMP support.
Updates tailscale/corp#22114
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Troubleshooting DNS resolution issues often requires additional information.
This PR expands the effect of the TS_DEBUG_DNS_FORWARD_SEND envknob to forwarder.forwardWithDestChan,
and includes the request type, domain name length, and the first 3 bytes of the domain's SHA-256 hash in the output.
Fixes#13070
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In a situation when manual edits are made on the admin panel, around the
GitOps process, the pusher will be stuck if `--fail-on-manual-edits` is
set, as expected.
To recover from this, there are 2 options:
1. revert the admin panel changes to get back in sync with the code
2. check in the manual edits to code
The former will work well, since previous and local ETags will match
control ETag again. The latter will still fail, since local and control
ETags match, but previous does not.
For this situation, check the local ETag against control first and
ignore previous when things are already in sync.
Updates https://github.com/tailscale/corp/issues/22177
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
For cases where users want to be extra careful about not overwriting
manual changes, add a flag to hard-fail. This is only useful if the etag
cache is persistent or otherwise reliable. This flag should not be used
in ephemeral CI workers that won't persist the cache.
Updates https://github.com/tailscale/corp/issues/22177
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
* cmd/tsidp: add funnel support
Updates #10263.
Signed-off-by: Naman Sood <mail@nsood.in>
* look past funnel-ingress-node to see who we're authenticating
Signed-off-by: Naman Sood <mail@nsood.in>
* fix comment typo
Signed-off-by: Naman Sood <mail@nsood.in>
* address review feedback, support Basic auth for /token
Turns out you need to support Basic auth if you do client ID/secret
according to OAuth.
Signed-off-by: Naman Sood <mail@nsood.in>
* fix typos
Signed-off-by: Naman Sood <mail@nsood.in>
* review fixes
Signed-off-by: Naman Sood <mail@nsood.in>
* remove debugging log
Signed-off-by: Naman Sood <mail@nsood.in>
* add comments, fix header
Signed-off-by: Naman Sood <mail@nsood.in>
---------
Signed-off-by: Naman Sood <mail@nsood.in>
This commit adds a batchingConn interface, and renames batchingUDPConn
to linuxBatchingConn. tryUpgradeToBatchingConn() may return a platform-
specific implementation of batchingConn. So far only a Linux
implementation of this interface exists, but this refactor is being
done in anticipation of a Windows implementation.
Updates tailscale/corp#21874
Signed-off-by: Jordan Whited <jordan@tailscale.com>
The same context we use for the HTTP request here might be re-used by
the dialer, which could result in `GotConn` being called multiple times.
We only care about the last one.
Fixes#13009
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
This change adds an HTTP handler with a table showing a list of all
probes, their status, and a button that allows triggering a specific
probe.
Updates tailscale/corp#20583
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
- Keep track of the last 10 probe results and successful probe
latencies;
- Add an HTTP handler that triggers a given probe by name and returns it
result as a plaintext HTML page, showing recent probe results as a
baseline
Updates tailscale/corp#20583
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
I noticed a few places with custom http.Transport where we are not
closing idle connections when transport is no longer used.
Updates tailscale/corp#21609
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
During review of #8644 the `recover-compromised-key` command was renamed
to `revoke-key`, but the old name remained in some messages printed by
the command.
Fixestailscale/corp#19446
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
We were copying 12 out of the 16 bytes which meant that
the 1:1 NAT required would only work if the last 4 bytes
happened to match between the new and old address, something
that our tests accidentally had. Fix it by copying the full
16 bytes and make the tests also verify the addr and use rand
addresses.
Updates #9511
Signed-off-by: Maisem Ali <maisem@tailscale.com>
It was returning a nil `*iptablesRunner` instead of a
nil `NetfilterRunner` interface which would then fail
checks later.
Fixes#13012
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This commit increases gVisor's TCP max send (4->6MiB) and receive
(4->8MiB) buffer sizes on all platforms except iOS. These values are
biased towards higher throughput on high bandwidth-delay product paths.
The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. 100ms of RTT latency is
introduced via Linux's traffic control network emulator queue
discipline.
The first set of results are from commit f0230ce prior to TCP buffer
resizing.
gVisor write direction:
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 180 MBytes 151 Mbits/sec 0 sender
[ 5] 0.00-10.10 sec 179 MBytes 149 Mbits/sec receiver
gVisor read direction:
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.10 sec 337 MBytes 280 Mbits/sec 20 sender
[ 5] 0.00-10.00 sec 323 MBytes 271 Mbits/sec receiver
The second set of results are from this commit with increased TCP
buffer sizes.
gVisor write direction:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 297 MBytes 249 Mbits/sec 0 sender
[ 5] 0.00-10.10 sec 297 MBytes 247 Mbits/sec receiver
gVisor read direction:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.10 sec 501 MBytes 416 Mbits/sec 17 sender
[ 5] 0.00-10.00 sec 485 MBytes 407 Mbits/sec receiver
Updates #9707
Updates tailscale/corp#22119
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Updates tailscale/tailscale#1634
This PR optimizes captive portal detection on Android and iOS by excluding cellular data interfaces (`pdp*` and `rmnet`). As cellular networks do not present captive portals, frequent network switches between Wi-Fi and cellular would otherwise trigger captive detection unnecessarily, causing battery drain.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
This commit implements TCP GRO for packets being written to gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported IP checksum functions.
gVisor is updated in order to make use of newly exported
stack.PacketBuffer GRO logic.
TCP throughput towards gVisor, i.e. TUN write direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement, sometimes as high as 2x. High bandwidth-delay product
paths remain receive window limited, bottlenecked by gVisor's default
TCP receive socket buffer size. This will be addressed in a follow-on
commit.
The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.
The first result is from commit 57856fc without TCP GRO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 4.77 GBytes 4.10 Gbits/sec 20 sender
[ 5] 0.00-10.00 sec 4.77 GBytes 4.10 Gbits/sec receiver
The second result is from this commit with TCP GRO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 10.6 GBytes 9.14 Gbits/sec 20 sender
[ 5] 0.00-10.00 sec 10.6 GBytes 9.14 Gbits/sec receiver
Updates #6816
Signed-off-by: Jordan Whited <jordan@tailscale.com>
I updated the address parsing stuff to return a specific error for
unspecified hosts passed as empty strings, and look for that
when logging errors. I explicitly did not make parseAddress return a
netip.Addr containing an unspecified address because at this layer,
in the absence of any host, we don't necessarily know the address
family we're dealing with.
For the purposes of this code I think this is fine, at least until
we implement #12588.
Fixes#12979
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
It seems some security software or macOS itself might be MITMing TLS
(for ScreenTime?), so don't warn unless it fails x509 validation
against system roots.
Updates #3198
Change-Id: I6ea381b5bb6385b3d51da4a1468c0d803236b7bf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit implements TCP GSO for packets being read from gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported GSO logic from its tun
package.
A new gVisor stack.LinkEndpoint implementation has been established
(linkEndpoint) that is loosely modeled after its predecessor
(channel.Endpoint). This new implementation supports GSO of monster TCP
segments up to 64K in size, whereas channel.Endpoint only supports up to
32K. linkEndpoint will also be required for GRO, which will be
implemented in a follow-on commit.
TCP throughput from gVisor, i.e. TUN read direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement through a wide range of RTT and loss conditions, sometimes
as high as 5x.
The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.
The first result is from commit 57856fc without TCP GSO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender
[ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver
The second result is from this commit with TCP GSO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender
[ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver
Updates #6816
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Fixestailscale/tailscale#12973
Updates tailscale/tailscale#1634
There was a logic issue in the captive detection code we shipped in https://github.com/tailscale/tailscale/pull/12707.
Assume a captive portal has been detected, and the user notified. Upon switching to another Wi-Fi that does *not* have a captive portal, we were issuing a signal to interrupt any pending captive detection attempt. However, we were not also setting the `captive-portal-detected` warnable to healthy. The result was that any "captive portal detected" alert would not be cleared from the UI.
Also fixes a broken log statement value.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
fixes tailscale#12968
The dns manager cleanup func was getting passed a nil
health tracker, which will panic. Fixed to pass it
the system health tracker.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
It is no longer correct to state that we don't support running Tailscale in containers or on Kubernetes.
Updates tailscale/tailscale#12842
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Add a warning that the Dockerfile in the OSS repo is not the
currently used mechanism to build the images we publish - for folks
who want to contribute to image build scripts or otherwise need to
understand the image build process that we use.
Updates#cleanup
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
All wasi* are GOARCH wasm, so check that instead.
Updates #12732
Change-Id: Id3cc346295c1641bcf80a6c5eb1ad65488509656
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
updates tailscale/corp#21823
Misconfigured, broken, or blocked DNS will often present as
"internet is broken'" to the end user. This plumbs the health tracker
into the dns manager and forwarder and adds a health warning
with a 5 second delay that is raised on failures in the forwarder and
lowered on successes.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
cmd/k8s-operator,k8s-operator/sessionrecording,sessionrecording,ssh/tailssh: refactor session recording functionality
Refactor SSH session recording functionality (mostly the bits related to
Kubernetes API server proxy 'kubectl exec' session recording):
- move the session recording bits used by both Tailscale SSH
and the Kubernetes API server proxy into a shared sessionrecording package,
to avoid having the operator to import ssh/tailssh
- move the Kubernetes API server proxy session recording functionality
into a k8s-operator/sessionrecording package, add some abstractions
in preparation for adding support for a second streaming protocol (WebSockets)
Updates tailscale/corp#19821
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Allows the use of tsweb.LogHandler exclusively for callbacks describing the
handler HTTP requests.
Fixes#12837
Signed-off-by: Paul Scott <paul@tailscale.com>
Re-instates the functionality that generates CRD API docs, but using
a different library as the one we were using earlier seemed to have
some issues with its Git history.
Also regenerates the docs (make kube-generate-all).
Updates tailscale/tailscale#12859
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Remove the restriction that getent is skipped on non-Linux unixes.
Improve validation of the parsed output from getent, in case unknown
systems return unusable information.
Fixes#12730.
Signed-off-by: Ross Williams <ross@ross-williams.net>
Updates tailscale/corp#21949
As discussed with @raggi, this PR updates the static DERPMap embedded in the client to reflect the availability of HTTP on the DERP servers run by Tailscale.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Updates tailscale/tailscale#1634
This PR introduces a new `captive-portal-detected` Warnable which is set to an unhealthy state whenever a captive portal is detected on the local network, preventing Tailscale from connecting.
ipn/ipnlocal: fix captive portal loop shutdown
Change-Id: I7cafdbce68463a16260091bcec1741501a070c95
net/captivedetection: fix mutex misuse
ipn/ipnlocal: ensure that we don't fail to start the timer
Change-Id: I3e43fb19264d793e8707c5031c0898e48e3e7465
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
wgengine/magicsock,ipn: allow setting static node endpoints via tailscaled config file.
Adds a new StaticEndpoints field to tailscaled config
that can be used to statically configure the endpoints
that the node advertizes. This field will replace
TS_DEBUG_PRETENDPOINTS env var that can be used to achieve the same.
Additionally adds some functionality that ensures that endpoints
are updated when configfile is reloaded.
Also, refactor configuring/reconfiguring components to use the
same functionality when configfile is parsed the first time or
subsequent times (after reload). Previously a configfile reload
did not result in resetting of prefs. Now it does- but does not yet
tell the relevant components to consume the new prefs. This is to
be done in a follow-up.
Updates tailscale/tailscale#12578
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
It is sometimes necessary to change a global lazy.SyncValue for the duration of a test. This PR adds a (*SyncValue[T]).SetForTest method to facilitate that.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The standard library includes these for strings and byte slices,
but it lacks similar functions for generic slices of comparable types.
Although they are not as commonly used, these functions are useful
in scenarios such as working with field index sequences (i.e., []int)
via reflection.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This adds support for container-like types such as Container[T] that
don't explicitly specify a view type for T. Instead, a package implementing
a container type should also implement and export a ContainerView[T, V] type
and a ContainerViewOf(*Container[T]) ContainerView[T, V] function, which
returns a view for the specified container, inferring the element view type V
from the element type T.
Updates #12736
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Some users run "tailscale cert" in a cron job to renew their
certificates on disk. The time until the next cron job run may be long
enough for the old cert to expire with our default heristics.
Add a `--min-validity` flag which ensures that the returned cert is
valid for at least the provided duration (unless it's longer than the
cert lifetime set by Let's Encrypt).
Updates #8725
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Remove fybrik.io/crdoc dependency as it is causing issues for folks attempting
to vendor tailscale using GOPROXY=direct.
This means that the CRD API docs in ./k8s-operator/api.md will no longer
be generated- I am going to look at replacing it with another tool
in a follow-up.
Updates tailscale/tailscale#12859
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Windows requires routes to have a nexthop. Routes created using the interface's local IP address or an unspecified IP address ("0.0.0.0" or "::") as the nexthop are considered on-link routes. Notably, Windows treats on-link subnet routes differently, reserving the last IP in the range as the broadcast IP and therefore prohibiting TCP connections to it, resulting in WSA error 10049: "The requested address is not valid in its context. This does not happen with single-host routes, such as routes to Tailscale IP addresses, but becomes a problem with advertised subnets when all IPs in the range should be reachable.
Before Windows 8, only routes created with an unspecified IP address were considered on-link, so our previous approach of using the interface's own IP as the nexthop likely worked on Windows 7.
This PR updates configureInterface to use the TailscaleServiceIP (100.100.100.100) and its IPv6 counterpart as the nexthop for subnet routes.
Fixestailscale/support-escalations#57
Signed-off-by: Nick Khyl <nickk@tailscale.com>
With this change, the error handling and request logging are all done in defers
after calling inner.ServeHTTP. This ensures that any recovered values which we
want to re-panic with retain a useful stacktrace. However, we now only
re-panic from errorHandler when there's no outside logHandler. Which if you're
using StdHandler there always is. We prefer this to ensure that we are able to
write a 500 Internal Server Error to the client. If a panic hits http.Server
then the response is not sent back.
Updates #12784
Signed-off-by: Paul Scott <paul@tailscale.com>
... and then do approximately nothing with that information, other
than a big TODO. This is mostly me relearning this code and leaving
breadcrumbs for others in the future.
Updates #12724
Signed-off-by: Brad Fitzpatrick <brad@danga.com>
StdHandler/retHandler would previously emit one log line for each request.
If there were multiple StdHandler in the chain, there would be one log line
per instance of retHandler.
With this change, only the outermost StdHandler/logHandler actually logs the
request or invokes OnStart or OnCompletion callbacks. The error-rendering part
of retHandler lives on in errorHandler, and errorHandler passes those errors up
the stack to logHandler through a callback that logHandler places in the
request.Context().
Updates tailscale/corp#19999
Signed-off-by: Paul Scott <paul@tailscale.com>
To match the format of exit node suggestions and ensure that the result
is not ambiguous, relax exit node CLI selection to permit using a FQDN
including the trailing dot.
Updates #12618
Change-Id: I04b9b36d2743154aa42f2789149b2733f8555d3f
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
Fixestailscale/tailscale#12794
We were printing some leftover debug logs within a callback function that would be executed after the test completion, causing the test to fail. This change drops the log calls to address the issue.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
If we get an non-disco presumably-wireguard-encrypted UDP packet from
an IP:port we don't recognize, rather than drop the packet, give it to
WireGuard anyway and let WireGuard try to figure out who it's from and
tell us.
This uses the new hook added in https://github.com/tailscale/wireguard-go/pull/27
Updates tailscale/corp#20732
Change-Id: I5c61a40143810592f9efac6c12808a87f924ecf2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Some operations cannot be implemented with the prior API:
* Iterating over the map and deleting keys
* Iterating over the map and replacing items
* Calling APIs that expect a native Go map
Add a Map.WithLock method that acquires a write-lock on the map
and then calls a user-provided closure with the underlying Go map.
This allows users to interact with the Map as a regular Go map,
but with the gaurantees that it is concurrent safe.
Updates tailscale/corp#9115
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
This adds support for generic types and interfaces to our cloner and viewer codegens.
It updates these packages to determine whether to make shallow or deep copies based
on the type parameter constraints. Additionally, if a template parameter or an interface
type has View() and Clone() methods, we'll use them for getters and the cloner of the
owning structure.
Updates #12736
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates tailscale/tailscale#4136
To reduce the likelihood of presenting spurious warnings, add the ability to delay the visibility of certain Warnables, based on a TimeToVisible time.Duration field on each Warnable. The default is zero, meaning that a Warnable is immediately visible to the user when it enters an unhealthy state.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
This commit truncates any additional information (mainly hostnames) that's passed to controlD via DOH URL in DoHIPsOfBase.
This change is to make sure only resolverID is passed to controlDv6Gen but not the additional information.
Updates: #7946
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
If an optional `hwaddrs` URL parameter is present, add network interface
hardware addresses to the posture identity response.
Just like with serial numbers, this requires client opt-in via MDM or
`tailscale set --posture-checking=true`
(https://tailscale.com/kb/1326/device-identity)
Updates tailscale/corp#21371
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Load Balancers often have more than one ingress IP, so allowing us to
add multiple means we can offer multiple options.
Updates #12578
Change-Id: I4aa49a698d457627d2f7011796d665c67d4c7952
Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
This adds a package with GP-related functions and types to be used in the future PRs.
It also updates nrptRuleDatabase to use the new package instead of its own gpNotificationWatcher implementation.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We added a workaround for --wait, but didn't confirm the other flags,
which were added in systemd 235 and 236. Check systemd version for
deciding when to set all 3 flags.
Fixes#12136
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
While `clientupdate.Updater` won't be able to apply updates on macsys,
we use `clientupdate.CanAutoUpdate` to gate the EditPrefs endpoint in
localAPI. We should allow the GUI client to set AutoUpdate.Apply on
macsys for it to properly get reported to the control plane. This also
allows the tailnet-wide default for auto-updates to propagate to macsys
clients.
Updates https://github.com/tailscale/corp/issues/21339
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
cmd/k8s-operator,ssh/tailssh,tsnet: optionally record kubectl exec sessions
The Kubernetes operator's API server proxy, when it receives a request
for 'kubectl exec' session now reads 'RecorderAddrs', 'EnforceRecorder'
fields from tailcfg.KubernetesCapRule.
If 'RecorderAddrs' is set to one or more addresses (of a tsrecorder instance(s)),
it attempts to connect to those and sends the session contents
to the recorder before forwarding the request to the kube API
server. If connection cannot be established or fails midway,
it is only allowed if 'EnforceRecorder' is not true (fail open).
Updates tailscale/corp#19821
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
For testing. Lee wants to play with 'AWS Global Accelerator Custom
Routing with Amazon Elastic Kubernetes Service'. If this works well
enough, we can promote it.
Updates #12578
Change-Id: I5018347ed46c15c9709910717d27305d0aedf8f4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The DERP Return Path Optimization (DRPO) is over four years old (and
on by default for over two) and we haven't had problems, so time to
remove the emergency shutoff code (controlknob) which we've never
used. The controlknobs are only meant for new features, to mitigate
risk. But we don't want to keep them forever, as they kinda pollute
the code.
Updates #150
Change-Id: If021bc8fd1b51006d8bddd1ffab639bb1abb0ad1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And fix up a bogus comment and flesh out some other comments.
Updates #cleanup
Change-Id: Ia60a1c04b0f5e44e8d9587914af819df8e8f442a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
cmd/containerboot,cmd/k8s-operator: enable IPv6 for fqdn egress proxies
Don't skip installing egress forwarding rules for IPv6 (as long as the host
supports IPv6), and set headless services `ipFamilyPolicy` to
`PreferDualStack` to optionally enable both IP families when possible. Note
that even with `PreferDualStack` set, testing a dual-stack GKE cluster with
the default DNS setup of kube-dns did not correctly set both A and
AAAA records for the headless service, and instead only did so when
switching the cluster DNS to Cloud DNS. For both IPv4 and IPv6 to work
simultaneously in a dual-stack cluster, we require headless services to
return both A and AAAA records.
If the host doesn't support IPv6 but the FQDN specified only has IPv6
addresses available, containerboot will exit with error code 1 and an
error message because there is no viable egress route.
Fixes#12215
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Updates tailscale/tailscale#4136
We should make sure to send the value of ImpactsConnectivity over to the clients using LocalAPI as they need it to display alerts in the GUI properly.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
This change expands the `exit-node list -filter` command to display all
location based exit nodes for the filtered country. This allows users
to switch to alternative servers when our recommended exit node is not
working as intended.
This change also makes the country filter matching case insensitive,
e.g. both USA and usa will work.
Updates #12698
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
Updates tailscale/tailscale#4136
High severity health warning = a system notification will appear, which can be quite disruptive to the user and cause unnecessary concern in the event of a temporary network issue.
Per design decision (@sonovawolf), the severity of all warnings but "network is down" should be tuned down to medium/low. ImpactsConnectivity should be set, to change the icon to an exclamation mark in some cases, but without a notification bubble.
I also tweaked the messaging for update-available, to reflect how each platform gets updates in different ways.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Updates tailscale/corp#20677
The recover function wasn't getting set in the benchmark
tests. Default changed to an empty func.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Fix regression from #8108 (Mar 2023). Since that change, gocross has
always been rebuilt on each run of ./tool/go (gocross-wrapper.sh),
adding ~100ms. (Well, not totally rebuilt; cmd/go's caching still
ends up working fine.)
The problem was $gocross_path was just "gocross", which isn't in my
path (and "." isn't in my $PATH, as it shouldn't be), so this line was
always evaluating to the empty string:
gotver="$($gocross_path gocross-version 2>/dev/null || echo '')"
The ./gocross is fine because of the earlier `cd "$repo_root"`
Updates tailscale/corp#21262
Updates tailscale/corp#21263
Change-Id: I80d25446097a3bb3423490c164352f0b569add5f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
go get github.com/tailscale/mkctr@main
Pulls in changes to support a local target that only pushes
a single-platform image to the machine's local image store.
Fixestailscale/mkctr#18
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
BPF links require that the owning FD remains open, this FD is embedded
into the RawLink returned by the attach function and must live for the
duration of the server.
Updates ENG-4274
Signed-off-by: James Tucker <james@tailscale.com>
Detection of duplicate Network Lock signature chains added in
01847e0123 failed to account for chains
originating with a SigCredential signature, which is used for wrapped
auth keys. This results in erroneous removal of signatures that
originate from the same re-usable auth key.
This change ensures that multiple nodes created by the same re-usable
auth key are not getting filtered out by the network lock.
Updates tailscale/corp#19764
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
This change moves handling of wrapped auth keys to the `tka` package and
adds a test covering auth key originating signatures (SigCredential) in
netmap.
Updates tailscale/corp#19764
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
In hopes it'll be found more.
Updates tailscale/corp#20844
Change-Id: Ic92ee9908f45b88f8770de285f838333f9467465
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
A few other minor language updates.
Updates tailscale/corp#20844
Change-Id: Idba85941baa0e2714688cc8a4ec3e242e7d1a362
Signed-off-by: James Tucker <james@tailscale.com>
And some misc doc tweaks for idiomatic Go style.
Updates #cleanup
Change-Id: I3ca45f78aaca037f433538b847fd6a9571a2d918
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We cannot directly pass a flat domain name into NetUserGetInfo; we must
resolve the address of a domain controller first.
This PR implements the appropriate resolution mechanisms to do that, and
also exposes a couple of new utility APIs for future needs.
Fixes#12627
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
This turns the checklocks workflow into a real check, and adds
annotations to a few basic packages as a starting point.
Updates #12625
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2b0185bae05a843b5257980fc6bde732b1bdd93f
The exit node suggestion CLI command was written with the assumption
that it's possible to provide a stableid on the command line, but this
is incorrect. Instead, it will now emit the name of the exit node.
Fixes#12618
Change-Id: Id7277f395b5fca090a99b0d13bfee7b215bc9802
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
The context can get canceled during backoff, and binding after that
makes the listener impossible to close afterwards.
Fixes#12620.
Signed-off-by: Naman Sood <mail@nsood.in>
To complement the existing `onCompletion` callback, which is called
after request handler.
Updates tailscale/corp#17075
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
We can observe a data race in tests when logging after a test is
finished. `b.onHealthChange` is called in a goroutine after being
registered with `health.Tracker.RegisterWatcher`, which calls callbacks
in `setUnhealthyLocked` in a new goroutine.
See: https://github.com/tailscale/tailscale/actions/runs/9672919302/job/26686038740
Updates #12054
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibf22cc994965d88a9e7236544878d5373f91229e
This PR ties together pseudoconsoles, user profiles, s4u logons, and
process creation into what is (hopefully) a simple API for various
Tailscale services to obtain Windows access tokens without requiring
knowledge of any Windows passwords. It works both for domain-joined
machines (Kerberos) and non-domain-joined machines. The former case
is fairly straightforward as it is fully documented. OTOH, the latter
case is not documented, though it is fully defined in the C headers in
the Windows SDK. The documentation blanks were filled in by reading
the source code of Microsoft's Win32 port of OpenSSH.
We need to do a bit of acrobatics to make conpty work correctly while
creating a child process with an s4u token; see the doc comments above
startProcessInternal for details.
Updates #12383
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Previously, if we had a umask set (e.g. 0027) that prevented creating a
world-readable file, /etc/resolv.conf would be created without the o+r
bit and thus other users may be unable to resolve DNS.
Since a umask only applies to file creation, chmod the file after
creation and before renaming it to ensure that it has the appropriate
permissions.
Updates #12609
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2a05d64f4f3a8ee8683a70be17a7da0e70933137
The logic we added in #11378 would prevent selecting a home DERP if we
have no control connection.
Updates tailscale/corp#18095
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I44bb6ac4393989444e4961b8cfa27dc149a33c6e
Fixestailscale/corp#20677
Replaces the original attempt to rectify this (by injecting a netMon
event) which was both heavy handed, and missed cases where the
netMon event was "minor".
On apple platforms, the fetching the interface's nameservers can
and does return an empty list in certain situations. Apple's API
in particular is very limiting here. The header hints at notifications
for dns changes which would let us react ahead of time, but it's all
private APIs.
To avoid remaining in the state where we end up with no
nameservers but we absolutely need them, we'll react
to a lack of upstream nameservers by attempting to re-query
the OS.
We'll rate limit this to space out the attempts. It seems relatively
harmless to attempt a reconfig every 5 seconds (triggered
by an incoming query) if the network is in this broken state.
Missing nameservers might possibly be a persistent condition
(vs a transient error), but that would also imply that something
out of our control is badly misconfigured.
Tested by randomly returning [] for the nameservers. When switching
between Wifi networks, or cell->wifi, this will randomly trigger
the bug, and we appear to reliably heal the DNS state.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
So non-local users (e.g. Kerberos on FreeIPA) on Linux can be looked
up. Our default binaries are built with pure Go os/user which only
supports the classic /etc/passwd and not any libc-hooked lookups.
Updates #12601
Change-Id: I9592db89e6ca58bf972f2dcee7a35fbf44608a4f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
stunstamp now sends data to Prometheus via remote write, and Prometheus
can serve the same data. Retaining and cleaning up old data in sqlite
leads to long probing pauses, and it's not worth investing more effort
to optimize the schema and/or concurrency model.
Updates tailscale/corp#20344
Signed-off-by: Jordan Whited <jordan@tailscale.com>
PeerPresentFlags was added in 5ffb2668ef but wasn't plumbed through to
the RunConnectionLoop. Rather than add yet another parameter (as
IP:port was added earlier), pass in the raw PeerPresentMessage and
PeerGoneMessage struct values, which are the same things, plus two
fields: PeerGoneReasonType for gone and the PeerPresentFlags from
5ffb2668ef.
Updates tailscale/corp#17816
Change-Id: Ib19d9f95353651ada90656071fc3656cf58b7987
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When the store-appc-routes flag is on for a tailnet we are writing the
routes more often than seems necessary. Investigation reveals that we
are doing so ~every time we observe a dns response, even if this causes
us not to advertise any new routes. So when we have no new routes,
instead do not advertise routes.
Fixes#12593
Signed-off-by: Fran Bull <fran@tailscale.com>
I couldn't convince myself the old way was safe and couldn't lose
writes.
And it seemed too complicated.
Updates tailscale/corp#21104
Change-Id: I17ba7c7d6fd83458a311ac671146a1f6a458a5c1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
sendMeshUpdates tries to write as much as possible without blocking,
being careful to check the bufio.Writer.Available size before writes.
Except that regressed in 6c791f7d60 which made those messages larger, which
meants we were doing network I/O with the Server mutex held.
Updates tailscale/corp#13945
Change-Id: Ic327071d2e37de262931b9b390cae32084811919
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This adds the ability to "peek" at the value of a SyncValue, so that
it's possible to observe a value without computing this.
Updates tailscale/corp#17122
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I06f88c22a1f7ffcbc7ff82946335356bb0ef4622
This is implemented via GetBestInterfaceEx. Should we encounter errors
or fail to resolve a valid, non-Tailscale interface, we fall back to
returning the index for the default interface instead.
Fixes#12551
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Timeouts could already be identified as NaN values on
stunstamp_derp_stun_rtt_ns, but we can't use NaN effectively with
promql to visualize them. So, this commit adds a timeouts metric that
we can use with rate/delta/etc promql functions.
Updates tailscale/corp#20689
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Changes "Accept" TCP logs to display in verbose logs only,
and removes lines from default logging behavior.
Updates #12158
Signed-off-by: Keli Velazquez <keli@tailscale.com>
This allows the SSH_AUTH_SOCK environment variable to work inside of
su and agent forwarding to succeed.
Fixes#12467
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This actually performs a Noise request in the 'debug ts2021' command,
instead of just exiting once we've dialed a connection. This can help
debug certain forms of captive portals and deep packet inspection that
will allow a connection, but will RST the connection when trying to send
data on the post-upgraded TCP connection.
Updates #1634
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1e46ca9c9a0751c55f16373a6a76cdc24fec1f18
So that it can be later used in the 'tailscale debug ts2021' function in
the CLI, to aid in debugging captive portals/WAFs/etc.
Updates #1634
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Iec9423f5e7570f2c2c8218d27fc0902137e73909
Fix regression from bd93c3067e where I didn't notice the
32-bit test failure was real and not its usual slowness-related
regression. Yay failure blindness.
Updates #12526
Change-Id: I00e33bba697e2cdb61a0d76a71b62406f6c2eeb9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Looks like a DERPmap might not be available when we try to get the
name associated with a region ID, and that was causing an intermittent
panic in CI.
Fixes#12534
Change-Id: I4ace53681bf004df46c728cff830b27339254243
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
It was hex-ifying the String() form of key.NodePublic, which was already hex.
I noticed in some logs:
"client 6e6f64656b65793a353537353..."
And thought that 6x6x6x6x looked strange. It's "nodekey:" in hex.
Updates tailscale/corp#20844
Change-Id: Ib9f2d63b37e324420b86efaa680668a9b807e465
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The control plane hasn't sent it to clients in ages.
Updates tailscale/corp#20965
Change-Id: I1d71a4b6dd3f75010a05c544ee39827837c30772
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I meant to do this in the earlier change and had a git fail.
To atone, add a test too while I'm here.
Updates #12486
Updates #12507
Change-Id: I4943b454a2530cb5047636f37136aa2898d2ffc7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Fixestailscale/corp#20971
We added some Warnables for DERP failure situations, but their Text currently spits out the DERP region ID ("10") in the UI, which is super ugly. It would be better to provide the RegionName of the DERP region that is failing. We can do so by storing a reference to the last-known DERP map in the health package whenever we fetch one, and using it when generating the notification text.
This way, the following message...
> Tailscale could not connect to the relay server '10'. The server might be temporarily unavailable, or your Internet connection might be down.
becomes:
> Tailscale could not connect to the 'Seattle' relay server. The server might be temporarily unavailable, or your Internet connection might be down.
which is a lot more user-friendly.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Updates tailscale/corp#20969
Right now, when netcheck starts, it asks tailscaled for a copy of the DERPMap. If it doesn't have one, it makes a HTTPS request to controlplane.tailscale.com to fetch one.
This will always fail if you're on a network with a captive portal actively blocking HTTPS traffic. The code appears to hang entirely because the http.Client doesn't have a Timeout set. It just sits there waiting until the request succeeds or fails.
This adds a timeout of 10 seconds, and logs more details about the status of the HTTPS request.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Updates #4136
Small PR to expose the health Warnables dependencies to the GUI via LocalAPI, so that we can only show warnings for root cause issues, and filter out unnecessary messages before user presentation.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
I noticed we were allocating these every time when they could just
share the same memory. Rather than document ownership, just lock it
down with a view.
I was considering doing all of the fields but decided to just do this
one first as test to see how infectious it became. Conclusion: not
very.
Updates #cleanup (while working towards tailscale/corp#20514)
Change-Id: I8ce08519de0c9a53f20292adfbecd970fe362de0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Adds a new TailscaleProxyReady condition type for use in corev1.Service
conditions.
Also switch our CRDs to use metav1.Condition instead of
ConnectorCondition. The Go structs are seralized identically, but it
updates some descriptions and validation rules. Update k8s
controller-tools and controller-runtime deps to fix the documentation
generation for metav1.Condition so that it excludes comments and
TODOs.
Stop expecting the fake client to populate TypeMeta in tests. See
kubernetes-sigs/controller-runtime#2633 for details of the change.
Finally, make some minor improvements to validation for service hostnames.
Fixes#12216
Co-authored-by: Irbe Krumina <irbe@tailscale.com>
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Previously, we were registering TCP and UDP connections in the same map,
which could result in erroneously removing a mapping if one of the two
connections completes while the other one is still active.
Add a "proto string" argument to these functions to avoid this.
Additionally, take the "proto" argument in LocalAPI, and plumb that
through from the CLI and add a new LocalClient method.
Updates tailscale/corp#20600
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I35d5efaefdfbf4721e315b8ca123f0c8af9125fb
We need to expand our enviornment information to include info about
the Windows store. Thinking about future plans, it would be nice
to include both the packaging mechanism and the distribution mechanism.
In this PR we change packageTypeWindows to check a new registry value
named MSIDist, and concatenate that value to "msi/" when present.
We also remove vestigial NSIS detection.
Updates https://github.com/tailscale/corp/issues/2790
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
The Add method derives a new ID by adding a signed integer
to the ID, treating it as an unsigned 256-bit big-endian integer.
We also add Less and Compare methods to PrivateID to provide
feature parity with existing methods on PublicID.
Updates tailscale/corp#11038
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
validate_udp_checksum was previously indeterminate (not zero) at
declaration, and IPv4 zero value UDP checksum packets were being passed
to the kernel.
Updates tailscale/corp#20689
Signed-off-by: Jordan Whited <jordan@tailscale.com>
* cmd/containerboot: store device ID before setting up proxy routes.
For containerboot instances whose state needs to be stored
in a Kubernetes Secret, we additonally store the device's
ID, FQDN and IPs.
This is used, between other, by the Kubernetes operator,
who uses the ID to delete the device when resources need
cleaning up and writes the FQDN and IPs on various kube
resource statuses for visibility.
This change shifts storing device ID earlier in the proxy setup flow,
to ensure that if proxy routing setup fails,
the device can still be deleted.
Updates tailscale/tailscale#12146
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
* code review feedback
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
---------
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We do not support specific version updates or track switching on macOS.
Do not populate the flag to avoid confusion.
Updates #cleanup
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Previously, we would only compare the current version to resolved latest
version for track. When running `tailscale update --track=stable` from
an unstable build, it would almost always fail because the stable
version is "older". But we should support explicitly switching tracks
like that.
Fixes#12347
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
For pprof cosmetic/confusion reasons more than performance, but it
might have tiny speed benefit.
Updates #12486
Change-Id: I40e03714f3afa3a7e7f5e1fa99b81c7e889b91b6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So profiles show more useful names than just func1, func2, func3, etc.
There will still be func1 on them all, but the symbol before will say
what the lookup type is.
Updates #12486
Change-Id: I910b024a7861394eb83d07f5a899eae338cb1f22
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This moves NewContainsIPFunc from tsaddr to new ipset package.
And wgengine/filter types gets split into wgengine/filter/filtertype,
so netmap (and thus the CLI, etc) doesn't need to bring in ipset,
bart, etc.
Then add a test making sure the CLI deps don't regress.
Updates #1278
Change-Id: Ia246d6d9502bbefbdeacc4aef1bed9c8b24f54d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I noticed the not-local-v6 numbers were nowhere near the v4 numbers
(they should be identical) and then saw this. It meant the
Addr().Next() wasn't picking an IP that was no longer local, as
assumed.
Updates #12486
Change-Id: I18dfb641f00c74c6252666bc41bd2248df15fadd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
NewContainsIPFunc was previously documented as performing poorly if
there were many netip.Prefixes to search over. As such, we never it used it
in such cases.
This updates it to use bart at a certain threshold (over 6 prefixes,
currently), at which point the bart lookup overhead pays off.
This is currently kinda useless because we're not using it. But now we
can and get wins elsewhere. And we can remove the caveat in the docs.
goos: darwin
goarch: arm64
pkg: tailscale.com/net/tsaddr
│ before │ after │
│ sec/op │ sec/op vs base │
NewContainsIPFunc/empty-8 2.215n ± 11% 2.239n ± 1% +1.08% (p=0.022 n=10)
NewContainsIPFunc/cidr-list-1-8 17.44n ± 0% 17.59n ± 6% +0.89% (p=0.000 n=10)
NewContainsIPFunc/cidr-list-2-8 27.85n ± 0% 28.13n ± 1% +1.01% (p=0.000 n=10)
NewContainsIPFunc/cidr-list-3-8 36.05n ± 0% 36.56n ± 13% +1.41% (p=0.000 n=10)
NewContainsIPFunc/cidr-list-4-8 43.73n ± 0% 44.38n ± 1% +1.50% (p=0.000 n=10)
NewContainsIPFunc/cidr-list-5-8 51.61n ± 2% 51.75n ± 0% ~ (p=0.101 n=10)
NewContainsIPFunc/cidr-list-10-8 95.65n ± 0% 68.92n ± 0% -27.94% (p=0.000 n=10)
NewContainsIPFunc/one-ip-8 4.466n ± 0% 4.469n ± 1% ~ (p=0.491 n=10)
NewContainsIPFunc/two-ip-8 8.002n ± 1% 7.997n ± 4% ~ (p=0.697 n=10)
NewContainsIPFunc/three-ip-8 27.98n ± 1% 27.75n ± 0% -0.82% (p=0.012 n=10)
geomean 19.60n 19.07n -2.71%
Updates #12486
Change-Id: I2e2320cc4384f875f41721374da536bab995c1ce
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This abstraction provides a nicer way to work with
maps of slices without having to write out three long type
params.
This also allows it to provide an AsMap implementation which
copies the map and the slices at least.
Updates tailscale/corp#20910
Signed-off-by: Maisem Ali <maisem@tailscale.com>
NewContainsIPFunc returns a contains matcher optimized for its
input. Use that instead of what this did before, always doing a test
over each of a list of netip.Prefixes.
goos: darwin
goarch: arm64
pkg: tailscale.com/wgengine/filter
│ before │ after │
│ sec/op │ sec/op vs base │
FilterMatch/file1-8 32.60n ± 1% 18.87n ± 1% -42.12% (p=0.000 n=10)
Updates #12486
Change-Id: I8f902bc064effb431e5b46751115942104ff6531
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Without this rule, Windows 8.1 and newer devices issue parallel DNS requests to DNS servers
associated with all network adapters, even when "Override local DNS" is enabled and/or
a Mullvad exit node is being used, resulting in DNS leaks.
This also adds "disable-local-dns-override-via-nrpt" nodeAttr that can be used to disable
the new behavior if needed.
Fixestailscale/corp#20718
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates tailscale/tailscale#4136
This PR is the first round of work to move from encoding health warnings as strings and use structured data instead. The current health package revolves around the idea of Subsystems. Each subsystem can have (or not have) a Go error associated with it. The overall health of the backend is given by the concatenation of all these errors.
This PR polishes the concept of Warnable introduced by @bradfitz a few weeks ago. Each Warnable is a component of the backend (for instance, things like 'dns' or 'magicsock' are Warnables). Each Warnable has a unique identifying code. A Warnable is an entity we can warn the user about, by setting (or unsetting) a WarningState for it. Warnables have:
- an identifying Code, so that the GUI can track them as their WarningStates come and go
- a Title, which the GUIs can use to tell the user what component of the backend is broken
- a Text, which is a function that is called with a set of Args to generate a more detailed error message to explain the unhappy state
Additionally, this PR also begins to send Warnables and their WarningStates through LocalAPI to the clients, using ipn.Notify messages. An ipn.Notify is only issued when a warning is added or removed from the Tracker.
In a next PR, we'll get rid of subsystems entirely, and we'll start using structured warnings for all errors affecting the backend functionality.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Fixestailscale/corp#18366.
This PR provides serial number collection on iOS, by allowing system administrators to pass a `DeviceSerialNumber` MDM key which can be read by the `posture` package in Go.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
S4U logons do not automatically load the associated user profile. In this
PR we add UserProfile to handle that part. Windows docs indicate that
we should try to resolve a remote profile path when present, so we attempt
to do so when the local computer is joined to a domain.
Updates #12383
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
We do not intend to use this value for feature support communication in
the future, and have applied changes elsewhere that now fix the expected
value.
Updates tailscale/corp#19391
Updates tailscale/corp#20398
Signed-off-by: James Tucker <james@tailscale.com>
This commit introduces a userspace program for managing an experimental
eBPF XDP STUN server program. derp/xdp contains the eBPF pseudo-C along
with a Go pkg for loading it and exporting its metrics.
cmd/xdpderper is a package main user of derp/xdp.
Updates tailscale/corp#20689
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This refactors the logic for determining whether a packet should be sent
to the host or not into a function, and then adds tests for it.
Updates #11304
Updates #12448
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ief9afa98eaffae00e21ceb7db073c61b170355e5
Fix a bug where, for a subnet router that advertizes
4via6 route, all packets with a source IP matching
the 4via6 address were being sent to the host itself.
Instead, only send to host packets whose destination
address is host's local address.
Fixestailscale/tailscale#12448
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Checking in the incubator as this used to do fails because
the getenforce command is not on the PATH.
Updates #12442
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Fixestailscale/corp#20677
On macOS sleep/wake, we're encountering a condition where reconfigure the network
a little bit too quickly - before apple has set the nameservers for our interface.
This results in a persistent condition where we have no upstream resolver and
fail all forwarded DNS queries.
No upstream nameservers is a legitimate configuration, and we have no (good) way
of determining when Apple is ready - but if we need to forward a query, and we
have no nameservers, then something has gone badly wrong and the network is
very broken.
A simple fix here is to simply inject a netMon event, which will go through the
configuration dance again when we hit the SERVFAIL condition.
Tested by artificially/randomly returning [] for the list of nameservers in the bespoke
ipn-bridge code responsible for getting the nameservers.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
As an alterative to #11935 using #12003.
Updates #11935
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I05f643fe812ceeaec5f266e78e3e529cab3a1ac3
Add an additional RecorderAddrs field to tailscale.com/cap/kubernetes
capability. RecorderAddrs will only be populated by control
with the addresses of any tsrecorder tags set via Recorder.
Updates tailscale/corp#19821
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
When we're starting child processes on Windows that are CLI programs that
don't need to output to a console, we should pass in DETACHED_PROCESS as a
CreationFlag on SysProcAttr. This prevents the OS from even creating a console
for the child (and paying the associated time/space penalty for new conhost
processes). This is more efficient than letting the OS create the console
window and then subsequently trying to hide it, which we were doing at a few
callsites.
Fixes#12270
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
As quad-100 is an authoritative server for 4via6 domains, it should always return responses
with a response code of 0 (indicating no error) when resolving records for these domains.
If there's no resource record of the specified type (e.g. A), it should return a response
with an empty answer section rather than NXDomain. Such a response indicates that there
is at least one RR of a different type (e.g., AAAA), suggesting the Windows stub resolver
to look for it.
Fixestailscale/corp#20767
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This adds a variant for Connect that takes in a context.Context
which allows passing through cancellation etc by the caller.
Updates tailscale/corp#18266
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This is a variant of DoChan that supports context propagation, such that
the context provided to the inner function will only be canceled when
there are no more waiters for a given key. This can be used to
deduplicate expensive and cancelable calls among multiple callers
safely.
Updates #11935
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibe1fb67442a854babbc6924fd8437b02cc9e7bcf
Return empty response and NOERROR for AAAA record queries
for DNS names for which we have an A record.
This is to allow for callers that might be first sending an AAAA query and then,
if that does not return a response, follow with an A record query.
Previously we were returning NOTIMPL that caused some callers
to potentially not follow with an A record query or misbehave in different ways.
Also return NXDOMAIN for AAAA record queries for names
that we DO NOT have an A record for to ensure that the callers
do not follow up with an A record query.
Returning an empty response and NOERROR is the behaviour
that RFC 4074 recommends:
https://datatracker.ietf.org/doc/html/rfc4074
Updates tailscale/tailscale#12321
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
AllocateContiguousBuffer is for allocating structs with trailing buffers
containing additional data. It is to be used for various Windows structures
containing pointers to data located immediately after the struct.
SetNTString performs in-place setting of windows.NTString and
windows.NTUnicodeString.
Updates #12383
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
This PR is in prep of adding logic to control to be able to parse
tailscale.com/cap/kubernetes grants in control:
- moves the type definition of PeerCapabilityKubernetes cap to a location
shared with control.
- update the Kubernetes cap rule definition with fields for granting
kubectl exec session recording capabilities.
- adds a convenience function to produce tailcfg.RawMessage from an
arbitrary cap rule and a test for it.
An example grant defined via ACLs:
"grants": [{
"src": ["tag:eng"],
"dst": ["tag:k8s-operator"],
"app": {
"tailscale.com/cap/kubernetes": [{
"recorder": ["tag:my-recorder"]
“enforceRecorder”: true
}],
},
}
]
This grant enforces `kubectl exec` sessions from tailnet clients,
matching `tag:eng` via API server proxy matching `tag:k8s-operator`
to be recorded and recording to be sent to a tsrecorder instance,
matching `tag:my-recorder`.
The type needs to be shared with control because we want
control to parse this cap and resolve tags to peer IPs.
Updates tailscale/corp#19821
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Add a new .spec.tailscale.acceptRoutes field to ProxyClass,
that can be optionally set to true for the proxies to
accept routes advertized by other nodes on tailnet (equivalent of
setting --accept-routes to true).
Updates tailscale/tailscale#12322,tailscale/tailscale#10684
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Also removes hardcoded image repo/tag from example DNSConfig resource
as the operator now knows how to default those.
Updates tailscale/tailscale#11019
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Add new fields TailnetIPs and Hostname to Connector Status. These
contain the addresses of the Tailscale node that the operator created
for the Connector to aid debugging.
Fixes#12214
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
cmd/k8s-operator,k8s-operator,go.{mod,sum}: make individual proxy images/image pull policies configurable
Allow to configure images and image pull policies for individual proxies
via ProxyClass.Spec.StatefulSet.Pod.{TailscaleContainer,TailscaleInitContainer}.Image,
and ProxyClass.Spec.StatefulSet.Pod.{TailscaleContainer,TailscaleInitContainer}.ImagePullPolicy
fields.
Document that we have images in ghcr.io on the relevant Helm chart fields.
Updates tailscale/tailscale#11675
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The last suggested exit node needs to be incorporated in the decision
making process when a new suggestion is requested, but currently it is
not quite right: it'll be used if the suggestion code has an error or a
netmap is unavailable, but it won't be used otherwise.
Instead, this makes the last suggestion into a tiebreaker when making a
random selection between equally-good options. If the last suggestion
does not make it to the final selection pool, then a different
suggestion will be made.
Since LocalBackend.SuggestExitNode is back to being a thin shim that
sets up the parameters to suggestExitNode, it no longer needs a test.
Its test was unable to be comprehensive anyway as the code being tested
contains an uncontrolled random number generator.
Updates tailscale/corp#19681
Change-Id: I94ecc9a0d1b622de3df4ef90523f1d3e67b4bfba
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
We assume most containers are immutable and don't expect tailscale
running in them to auto-update. But there's no reason to prohibit it
outright.
Ignore the tailnet-wide default auto-update setting in containers, but
allow local users to turn on auto-updates via the CLI.
RELNOTE=Auto-updates are allowed in containers, but ignore the tailnet-wide default.
Fixes#12292
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Updates corp#15802.
Adds the ability for control to disable the recently added change that uses split DNS in more cases on iOS. This will allow us to disable the feature if it leads to regression in production. We plan to remove this knob once we've verified that the feature works properly.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
StartupInfoBuilder is a helper for constructing StartupInfoEx structures
featuring proc/thread attribute lists. Calling its setters triggers the
appropriate setting of fields, adjusting flags as necessary, and populating
the proc/thread attribute list as necessary. Currently it supports four
features: setting std handles, setting pseudo-consoles, specifying handles
for inheritance, and specifying jobs.
The conpty package simplifies creation of pseudo-consoles, their associated
pipes, and assignment of the pty to StartupInfoEx proc/thread attributes.
Updates #12383
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
To make it easier for people to monitor their custom DERP fleet.
Updates tailscale/corp#20654
Change-Id: Id8af22936a6d893cc7b6186d298ab794a2672524
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This bug was introduced in e6b84f215 (May 2020) but was only used in
tests when stringifying probeProto values on failure so it wasn't
noticed for a long time.
But then it was moved into non-test code in 8450a18aa (Jun 2024) and I
didn't notice during the code movement that it was wrong. It's still
only used in failure paths in logs, but having wrong/ambiguous
debugging information isn't the best.
Whoops.
Updates tailscale/corp#20654
Change-Id: I296c727ed1c292a04db7b46ecc05c07fc1abc774
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Evaluation of remote write errors was using errors.Is() where it should
have been using errors.As().
Updates tailscale/corp#20344
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This updates breakglass to use the now-upsteamed
https://github.com/gokrazy/breakglass/pull/18 change
so we're not using our fork now.
It also adds a gok wrapper tool, because doing it by hand
was tedious.
Updates #1866
Change-Id: Ifacbf5fbf0e377b3bd95c5f76c18751c2e1af7d7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is done in preparation for adding kubectl
session recording rules to this capability grant that will need to
be unmarshalled by control, so will also need to be
in a shared location.
Updates tailscale/corp#19821
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Rather than building a new suggested exit node set every time, compute
it once on first use. Currently, syspolicy ensures that values do not
change without a restart anyway.
Since the set is being constructed in a separate func now, the test code
that manipulates syspolicy can live there, and the TestSuggestExitNode
can now run in parallel with other tests because it does not have global
dependencies.
Updates tailscale/corp#19681
Change-Id: Ic4bb40ccc91b671f9e542bd5ba9c96f942081515
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
cmd/k8s-operator/deploy/chart: Support image 'repo' or 'repository' keys in helm values
Fixes#12100
Signed-off-by: Michael Long <michaelongdev@gmail.com>
Clean up the updater goroutine on shutdown, in addition to doing that on
backend state change. This fixes a goroutine leak on shutdown in tests.
Updates #cleanup
When the client is disconnected from control for any reason (typically
just turned off), we should still attempt to update if auto-updates are
enabled. This may help users who turn tailscale on infrequently for
accessing resources.
RELNOTE: Apply auto-updates even if the node is down or disconnected
from the coordination server.
Updates #12117
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This adds a new prototype `cmd/natc` which can be used
to expose a services/domains to the tailnet.
It requires the user to specify a set of IPv4 prefixes
from the CGNAT range. It advertises these as normal subnet
routes. It listens for DNS on the first IP of the first range
provided to it.
When it gets a DNS query it allocates an IP for that domain
from the v4 range. Subsequent connections to the assigned IP
are then tcp proxied to the domain.
It is marked as a WIP prototype and requires the use of the
`TAILSCALE_USE_WIP_CODE` env var.
Updates tailscale/corp#20503
Signed-off-by: Maisem Ali <maisem@tailscale.com>
A `*listener` implements net.Listener which breaks
a test in another repo.
Regressed in 42cfbf427c.
Updates #12182
Signed-off-by: Maisem Ali <maisem@tailscale.com>
In order to test the sticky last suggestion code, a test was written for
LocalBackend.SuggestExitNode but it contains a random number generator
which makes writing comprehensive tests very difficult. This doesn't
change how the last suggestion works, but it adds some infrastructure to
make that easier in a later PR.
This adds func parameters for the two randomized parts: breaking ties
between DERP regions and breaking ties between nodes. This way tests can
validate the entire list of tied options, rather than expecting a
particular outcome given a particular random seed.
As a result of this, the global random number generator can be used
rather than seeding a local one each time.
In order to see the tied nodes for the location based (i.e. Mullvad)
case, pickWeighted needed to return a slice instead of a single
arbitrary option, so there is a small change in how that works.
Updates tailscale/corp#19681
Change-Id: I83c48a752abdec0f59c58ccfd8bfb3f3f17d0ea8
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
stunstamp timestamping includes userspace and SO_TIMESTAMPING kernel
timestamping where available. Measurements are written locally to a
sqlite DB, exposed over an HTTP API, and written to prometheus
via remote-write protocol.
Updates tailscale/corp#20344
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This mostly removes a lot of repetition by predefining some nodes and
other data structures, plus adds some helpers for creating Peer entries
in the netmap. Several existing test cases were reworked to ensure
better coverage of edge cases, and several new test cases were added to
handle some additional responsibility that is in (or will be shortly
moving in) suggestExitNode().
Updates tailscale/corp#19681
Change-Id: Ie14c2988d7fd482f7d6a877f78525f7788669b85
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
A non-signing node can be allowed to re-sign its new node keys following
key renewal/rotation (e.g. via `tailscale up --force-reauth`). To be
able to do this, node's TLK is written into WrappingPubkey field of the
initial SigDirect signature, signed by a signing node.
The intended use of this field implies that, for each WrappingPubkey, we
typically expect to have at most one active node with a signature
tracing back to that key. Multiple valid signatures referring to the
same WrappingPubkey can occur if a client's state has been cloned, but
it's something we explicitly discourage and don't support:
https://tailscale.com/s/clone
This change propagates rotation details (wrapping public key, a list
of previous node keys that have been rotated out) to netmap processing,
and adds tracking of obsolete node keys that, when found, will get
filtered out.
Updates tailscale/corp#19764
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
This adds a new ListenPacket function on tsnet.Server
which acts mostly like `net.ListenPacket`.
Unlike `Server.Listen`, this requires listening on a
specific IP and does not automatically listen on both
V4 and V6 addresses of the Server when the IP is unspecified.
To test this, it also adds UDP support to tsdial.Dialer.UserDial
and plumbs it through the localapi. Then an associated test
to make sure the UDP functionality works from both sides.
Updates #12182
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Alpine APK repos are versioned, and contain different package sets.
Older APK releases and repos don't have the latest tailscale package.
When we report "no update available", check whether pkgs.tailscale.com
has a newer tarball release. If it does, it's possible that the system
is on an older Alpine release. Print additional messages to suggest the
user to upgrade their OS.
Fixes#11309
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This fixes an issue where, on containerized environments an upgrade
1.66.3 -> 1.66.4 failed with default containerboot configuration.
This was because containerboot by default runs 'tailscale up'
that requires all previously set flags to be explicitly provided
on subsequent runs and we explicitly set --stateful-filtering
to true on 1.66.3, removed that settingon 1.66.4.
Updates tailscale/tailscale#12307
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Andrew Lytvynov <awly@tailscale.com>
The derp metrics got out of sync in 74eb99aed1 (2023-03).
They were fixed in 0380cbc90d (2024-05).
This adds some further guardrails (atop the previous fix) to make sure
they don't get out of sync again.
Updates #12288
Change-Id: I809061a81f8ff92f45054d0253bc13871fc71634
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This change makes our access log record more consistent with the
new log/tslog package formatting of "time". Note that we can
change slog itself to call "time" "when" but we're chosing
to make this breaking change to be consistent with the std lib's
defaults.
Updates tailscale/corp#17071
Signed-off-by: Marwan Sulaiman <marwan@tailscale.com>
- Add current node signature to `ipnstate.NetworkLockStatus`;
- Print current node signature in a human-friendly format as part
of `tailscale lock status`.
Examples:
```
$ tailscale lock status
Tailnet lock is ENABLED.
This node is accessible under tailnet lock. Node signature:
SigKind: direct
Pubkey: [OTB3a]
KeyID: tlpub:44a0e23cd53a4b8acc02f6732813d8f5ba8b35d02d48bf94c9f1724ebe31c943
WrappingPubkey: tlpub:44a0e23cd53a4b8acc02f6732813d8f5ba8b35d02d48bf94c9f1724ebe31c943
This node's tailnet-lock key: tlpub:44a0e23cd53a4b8acc02f6732813d8f5ba8b35d02d48bf94c9f1724ebe31c943
Trusted signing keys:
tlpub:44a0e23cd53a4b8acc02f6732813d8f5ba8b35d02d48bf94c9f1724ebe31c943 1 (self)
tlpub:6fa21d242a202b290de85926ba3893a6861888679a73bc3a43f49539d67c9764 1 (pre-auth key kq3NzejWoS11KTM59)
```
For a node created via a signed auth key:
```
This node is accessible under tailnet lock. Node signature:
SigKind: rotation
Pubkey: [e3nAO]
Nested:
SigKind: credential
KeyID: tlpub:6fa21d242a202b290de85926ba3893a6861888679a73bc3a43f49539d67c9764
WrappingPubkey: tlpub:3623b0412cab0029cb1918806435709b5947ae03554050f20caf66629f21220a
```
For a node that rotated its key a few times:
```
This node is accessible under tailnet lock. Node signature:
SigKind: rotation
Pubkey: [DOzL4]
Nested:
SigKind: rotation
Pubkey: [S/9yU]
Nested:
SigKind: rotation
Pubkey: [9E9v4]
Nested:
SigKind: direct
Pubkey: [3QHTJ]
KeyID: tlpub:44a0e23cd53a4b8acc02f6732813d8f5ba8b35d02d48bf94c9f1724ebe31c943
WrappingPubkey: tlpub:2faa280025d3aba0884615f710d8c50590b052c01a004c2b4c2c9434702ae9d0
```
Updates tailscale/corp#19764
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
The `--wait` flag for `systemd-run` was added in systemd 232. While it
is quite old, it doesn't hurt to special-case them and skip the `--wait`
flag. The consequence is that we lose the update command output in logs,
but at least auto-updates will work.
Fixes#12136
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This busybox fwmaskWorks check was added before we moved away from
using the "ip" command to using netlink directly.
So it's now just wasted work (and log spam on Gokrazy) to check the
"ip" command capabilities if we're never going to use it.
Do it lazily instead.
Updates #12277
Change-Id: I8ab9acf64f9c0d8240ce068cb9ec8c0f6b1ecee7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates https://github.com/tailscale/corp/issues/15802.
On iOS exclusively, this PR adds logic to use a split DNS configuration in more cases, with the goal of improving battery life. Acting as the global DNS resolver on iOS should be avoided, as it leads to frequent wakes of IPNExtension.
We try to determine if we can have Tailscale only handle DNS queries for resources inside the tailnet, that is, all routes in the DNS configuration do not require a custom resolver (this is the case for app connectors, for instance).
If so, we set all Routes as MatchDomains. This enables a split DNS configuration which will help preserve battery life. Effectively, for the average Tailscale user who only relies on MagicDNS to resolve *.ts.net domains, this means that Tailscale DNS will only be used for those domains.
This PR doesn't affect users with Override Local DNS enabled. For these users, there should be no difference and Tailscale will continue acting as a global DNS resolver.
Signed-off-by: Andrea Gottardo <andrea@tailscale.com>
This allows pam authentication to run for ssh sessions, triggering
automation like pam_mkhomedir.
Updates #11854
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This can be used to implement a persistent pool (i.e. one that isn't
cleared like sync.Pool is) of items–e.g. database connections.
Some benchmarks vs. a naive implementation that uses a single map
iteration show a pretty meaningful improvement:
$ benchstat -col /impl ./bench.txt
goos: darwin
goarch: arm64
pkg: tailscale.com/util/pool
│ Pool │ map │
│ sec/op │ sec/op vs base │
Pool_AddDelete-10 10.56n ± 2% 15.11n ± 1% +42.97% (p=0.000 n=10)
Pool_TakeRandom-10 56.75n ± 4% 1899.50n ± 20% +3246.84% (p=0.000 n=10)
geomean 24.49n 169.4n +591.74%
Updates tailscale/corp#19900
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ie509cb65573c4726cfc3da9a97093e61c216ca18
We don't build a lot of tools with CGO, but we do build some, and it's
extremely valuable for production services in particular to have symbols
included - for perf and so on.
I tested various other builds that could be affected negatively, in
particular macOS/iOS, but those use split-dwarf already as part of their
build path, and Android which does not currently use gocross.
One binary which is normally 120mb only grew to 123mb, so the trade-off
is definitely worthwhile in context.
Updates tailscale/corp#20296
Signed-off-by: James Tucker <james@tailscale.com>
Palo Alto reported interpreting hairpin probes as LAND attacks, and the
firewalls may be responding to this by shutting down otherwise in use NAT sessions
prematurely. We don't currently make use of the outcome of the hairpin
probes, and they contribute to other user confusion with e.g. the
AirPort Extreme hairpin session workaround. We decided in response to
remove the whole probe feature as a result.
Updates #188
Updates tailscale/corp#19106
Updates tailscale/corp#19116
Signed-off-by: James Tucker <james@tailscale.com>
After some analysis, stateful filtering is only necessary in tailnets
that use `autogroup:danger-all` in `src` in ACLs. And in those cases
users explicitly specify that hosts outside of the tailnet should be
able to reach their nodes. To fix local DNS breakage in containers, we
disable stateful filtering by default.
Updates #12108
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This change updates the existing api.md TOC links to point at the new
publicapi folder/files. It also removes the body of the docs from the
file, to avoid the docs becoming out of sync.
This change also renames overview.md to readme.md.
Updates tailscale/corp#19526
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
It was requested by the first customer 4-5 years ago and only used
for a brief moment of time. We later added netmap visibility trimming
which removes the need for this.
It's been hidden by the CLI for quite some time and never documented
anywhere else.
This keeps the CLI flag, though, out of caution. It just returns an
error if it's set to anything but true (its default).
Fixes#12058
Change-Id: I7514ba572e7b82519b04ed603ff9f3bdbaecfda7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates #12172 (then need to update other repos)
Change-Id: I439f65e0119b09e00da2ef5c7a4f002f93558578
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This change includes the device and user invites API docs in the
new publicapi documentation structure.
Updates tailscale/corp#19526
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
Palo Alto firewalls have a typically hard NAT, but also have a mode
called Persistent DIPP that is supposed to provide consistent port
mapping suitable for STUN resolution of public ports. Persistent DIPP
works initially on most Palo Alto firewalls, but some models/software
versions have a bug which this works around.
The bug symptom presents as follows:
- STUN sessions resolve a consistent public IP:port to start with
- Much later netchecks report the same IP:Port for a subset of
sessions, most often the users active DERP, and/or the port related
to sustained traffic.
- The broader set of DERPs in a full netcheck will now consistently
observe a new IP:Port.
- After this point of observation, new inbound connections will only
succeed to the new IP:Port observed, and existing/old sessions will
only work to the old binding.
In this patch we now advertise the lowest latency global endpoint
discovered as we always have, but in addition any global endpoints that
are observed more than once in a single netcheck report. This should
provide viable endpoints for potential connection establishment across
a NAT with this behavior.
Updates tailscale/corp#19106
Signed-off-by: James Tucker <james@tailscale.com>
This change creates a new folder called publicapi that will become the
future home to the Tailscale public API docs.
This change also splits the existing API docs (still located in api.md)
into separate files, for easier reading and contribution.
Updates tailscale/corp#19526
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
Fixestailscale/tailscale#10393Fixestailscale/corp#15412Fixestailscale/corp#19808
On Apple platforms, exit nodes and subnet routers have been unable to relay pings from Tailscale devices to non-Tailscale devices due to sandbox restrictions imposed on our network extensions by Apple. The sandbox prevented the code in netstack.go from spawning the `ping` process which we were using.
Replace that exec call with logic to send an ICMP echo request directly, which appears to work in userspace, and not trigger a sandbox violation in the syslog.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Tracking down the side effect can otherwise be a pain, for example on
Darwin an empty GOOS resulted in CGO being implicitly disabled. The user
intended for `export GOOS=` to act like unset, and while this is a
misunderstanding, the main toolchain would treat it this way.
Fixestailscale/corp#20059
Signed-off-by: James Tucker <james@tailscale.com>
In this commit I updated the Ipv6 range we use to generate Control D DOH ip, we were using the NextDNSRanges to generate Control D DOH ip, updated to use the correct range.
Updates: #7946
Signed-off-by: Kevin Liang <kevinliang@tailscale.com>
In a configuration where the local node (ip1) has a different IP (ip2)
that it uses to communicate with a peer (ip3) we would do UDP flow
tracking on the `ip2->ip3` tuple. When we receive the response from
the peer `ip3->ip2` we would dnat it back to `ip3->ip1` which would
then not match the flow track state and the packet would get dropped.
To fix this, we should do flow tracking on the `ip1->ip3` tuple instead
of `ip2->ip3` which requires doing SNAT after the running filterPacketOutboundToWireGuard.
Updates tailscale/corp#19971, tailscale/corp#8020
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Fixestailscale/corp#19979
A build with version number 275 was uploaded to the App Store without bumping OSS first. The presence of that build is causing any 274.* build to be rejected. To address this, added -1 to the year component, which means new builds will use the 275.* prefix.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Without changing behaviour, don't create a goroutine per connection that
sits and sleeps, but rather use a timer that wakes up and gathers
statistics on a regular basis.
Fixes#12127
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibc486447e403070bdc3c2cd8ae340e7d02854f21
* util/linuxfw: fix IPv6 NAT availability check for nftables
When running firewall in nftables mode,
there is no need for a separate NAT availability check
(unlike with iptables, there are no hosts that support nftables, but not IPv6 NAT - see tailscale/tailscale#11353).
This change fixes a firewall NAT availability check that was using the no-longer set ipv6NATAvailable field
by removing the field and using a method that, for nftables, just checks that IPv6 is available.
Updates tailscale/tailscale#12008
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The previous LocalBackend & CLI 'up' changes improved some stuff, but
might've been too aggressive in some edge cases.
This simplifies the authURL vs authURLSticky distinction and removes
the interact field, which seemed to just just be about duplicate URL
suppression in IPN bus, back from when the IPN bus was a single client
at a time. This moves that suppression to a different spot.
Fixes#12119
Updates #12028
Updates #12042
Change-Id: I1f8800b1e82ccc1c8a0d7abba559e7404ddf41e4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This was a typo in 2e19790f61.
It should have been on `Map` and not on `*Map` as otherwise
it doesn't allow for chaining like `someView.SomeMap().AsMap()`
and requires first assigning it to a variable.
Updates #typo
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This adds a new `UserLogf` field to the `Server` struct.
When set this any logs generated by Server are logged using
`UserLogf` and all spammy backend logs are logged to `Logf`.
If it `UserLogf` is unset, we default to `log.Printf` and
if `Logf` is unset we discard all the spammy logs.
Fixes#12094
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Clients often perform a PROPFIND for the parent directory before
performing PROPFIND for specific children within that directory.
The PROPFIND for the parent directory is usually done at depth 1,
meaning that we already have information for all of the children.
By immediately adding that to the cache, we save a roundtrip to
the remote peer on the PROPFIND for the specific child.
Updates tailscale/corp#19779
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Turn off stateful filtering for egress proxies to allow cluster
traffic to be forwarded to tailnet.
Allow configuring stateful filter via tailscaled config file.
Deprecate EXPERIMENTAL_TS_CONFIGFILE_PATH env var and introduce a new
TS_EXPERIMENTAL_VERSIONED_CONFIG env var that can be used to provide
containerboot a directory that should contain one or more
tailscaled config files named cap-<tailscaled-cap-version>.hujson.
Containerboot will pick the one with the newest capability version
that is not newer than its current capability version.
Proxies with this change will not work with older Tailscale
Kubernetes operator versions - users must ensure that
the deployed operator is at the same version or newer (up to
4 version skew) than the proxies.
Updates tailscale/tailscale#12061
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
When Docker is detected on the host and stateful filtering is enabled,
Docker containers may be unable to reach Tailscale nodes (depending on
the network settings of a container). Detect Docker when stateful
filtering is enabled and print a health warning to aid users in noticing
this issue.
We avoid printing the warning if the current node isn't advertising any
subnet routes and isn't an exit node, since without one of those being
true, the node wouldn't have the correct AllowedIPs in WireGuard to
allow a Docker container to connect to another Tailscale node anyway.
Updates #12070
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Idef538695f4d101b0ef6f3fb398c0eaafc3ae281
We were missing `snat-subnet-routes`, `stateful-filtering`
and `netfilter-mode`. Add those to set too.
Fixes#12061
Signed-off-by: Maisem Ali <maisem@tailscale.com>
We are now publishing nameserver images to tailscale/k8s-nameserver,
so we can start defaulting the images if users haven't set
them explicitly, same as we already do with proxy images.
The nameserver images are currently only published for unstable
track, so we have to use the static 'unstable' tag.
Once we start publishing to stable, we can make the operator
default to its own tag (because then we'll know that for each
operator tag X there is also a nameserver tag X as we always
cut all images for a given tag.
Updates tailscale/tailscale#10499
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Previously, a node that was advertising a 4via6 route wouldn't be able
to make use of that same route; the packet would be delivered to
Tailscale, but since we weren't accepting it in handleLocalPackets, the
packet wouldn't be delivered to netstack and would never hit the 4via6
logic. Let's add that support so that usage of 4via6 is consistent
regardless of where the connection is initiated from.
Updates #11304
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ic28dc2e58080d76100d73b93360f4698605af7cb
The CLI's "up" is kinda chaotic and LocalBackend.Start is kinda
chaotic and they both need to be redone/deleted (respectively), but
this fixes some buggy behavior meanwhile. We were previously calling
StartLoginInteractive (to start the controlclient's RegisterRequest)
redundantly in some cases, causing test flakes depending on timing and
up's weird state machine.
We only need to call StartLoginInteractive in the client if Start itself
doesn't. But Start doesn't tell us that. So cheat a bit and a put the
information about whether there's a current NodeKey in the ipn.Status.
It used to be accessible over LocalAPI via GetPrefs as a private key but
we removed that for security. But a bool is fine.
So then only call StartLoginInteractive if that bool is false and don't
do it in the WatchIPNBus loop.
Fixes#12028
Updates #12042
Change-Id: I0923c3f704a9d6afd825a858eb9a63ca7c1df294
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There was a small window in ipnserver after we assigned a LocalBackend
to the ipnserver's atomic but before we Start'ed it where our
initalization Start could conflict with API calls from the LocalAPI.
Simplify that a bit and lay out the rules in the docs.
Updates #12028
Change-Id: Ic5f5e4861e26340599184e20e308e709edec68b1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We used to Lock, Unlock, Lock, Unlock quite a few
times in Start resulting in all sorts of weird race
conditions. Simplify it all and only Lock/Unlock once.
Updates #11649
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This removes one of the Lock,Unlock,Lock,Unlock at least in
the Start function. Still has 3 more of these.
Updates #11649
Signed-off-by: Maisem Ali <maisem@tailscale.com>
It was documented as such but seems to have been dropped in a
refactor, restore the behavior. This brings down the time it
takes to run a single integration test by 2s which adds up
quite a bit.
Updates tailscale/corp#19786
Signed-off-by: Maisem Ali <maisem@tailscale.com>
I found this too hard to read before.
This is pulled out of #12033 as it's unrelated cleanup in retrospect.
Updates #12028
Change-Id: I727c47e573217e3d1973c5b66a76748139cf79ee
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This way the default gets populated on first start, when no existing
state exists to migrate. Also fix `ipn.PrefsFromBytes` to preserve empty
fields, rather than layering `NewPrefs` values on top.
Updates https://github.com/tailscale/corp/issues/19623
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This adds a new bool that can be sent down from control
to do jailing on the client side. Previously this would
only be done from control by modifying the packet filter
we sent down to clients. This would result in a lot of
additional work/CPU on control, we could instead just
do this on the client. This has always been a TODO which
we keep putting off, might as well do it now.
Updates tailscale/corp#19623
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This plumbs a packet filter for jailed nodes through to the
tstun.Wrapper; the filter for a jailed node is equivalent to a "shields
up" filter. Currently a no-op as there is no way for control to
tell the client whether a peer is jailed.
Updates tailscale/corp#19623
Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Change-Id: I5ccc5f00e197fde15dd567485b2a99d8254391ad
Now that tsdial.Dialer.UserDial has been updated to honor the configured routes
and dial external network addresses without going through Tailscale, while also being
able to dial a node/subnet router on the tailnet, we can start using UserDial to forward
DNS requests. This is primarily needed for DNS over TCP when forwarding requests
to internal DNS servers, but we also update getKnownDoHClientForProvider to use it.
Updates tailscale/corp#18725
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This refactors the peerConfig struct to allow storing more
details about a peer and not just the masq addresses. To be
used in a follow up change.
As a side effect, this also makes the DNAT logic on the inbound
packet stricter. Previously it would only match against the packets
dst IP, not it also takes the src IP into consideration. The beahvior
is at parity with the SNAT case.
Updates tailscale/corp#19623
Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Change-Id: I5f40802bebbf0f055436eb8824e4511d0052772d
The CLI "up" command is a historical mess, both on the CLI side and
the LocalBackend side. We're getting closer to cleaning it up, but in
the meantime it was again implicated in flaky tests.
In this case, the background goroutine running WatchIPNBus was very
occasionally running enough to get to its StartLoginInteractive call
before the original goroutine did its Start call. That meant
integration tests were very rarely but sometimes logging in with the
default control plane URL out on the internet
(controlplane.tailscale.com) instead of the localhost control server
for tests.
This also might've affected new Headscale etc users on initial "up".
Fixes#11960Fixes#11962
Change-Id: I36f8817b69267a99271b5ee78cb7dbf0fcc0bd34
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I noticed this while working on the following fix to #11962.
Updates #11962
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I4c5894d8899d1ae8c42f54ecfd4d05a4a7ac598c
We'd like to use tsdial.Dialer.UserDial instead of SystemDial for DNS over TCP.
This is primarily necessary to properly dial internal DNS servers accessible
over Tailscale and subnet routes. However, to avoid issues when switching
between Wi-Fi and cellular, we need to ensure that we don't retain connections
to any external addresses on the old interface. Therefore, we need to determine
which dialer to use internally based on the configured routes.
This plumbs routes and localRoutes from router.Config to tsdial.Dialer,
and updates UserDial to use either the peer dialer or the system dialer,
depending on the network address and the configured routes.
Updates tailscale/corp#18725
Fixes#4529
Signed-off-by: Nick Khyl <nickk@tailscale.com>
set.Of(1, 2, 3) is prettier than set.SetOf([]int{1, 2, 3}).
I was going to change the signature of SetOf but then I noticed its
name has stutter anyway, so I kept it for compatibility. People can
prefer to use set.Of for new code or slowly migrate.
Also add a lazy Make method, which I often find myself wanting,
without having to resort to uglier mak.Set(&set, k, struct{}{}).
Updates #cleanup
Change-Id: Ic6f3870115334efcbd65e79c437de2ad3edb7625
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The inflight request tracker only starts recording a new bucket after
the first non-error request. Unfortunately, it's written in such a way
that ONLY successful requests are ever marked as being finished. Once a
bucket has had at least one successful request and begun to be tracked,
all subsequent error cases are never marked finished and always appear
as in-flight.
This change ensures that if a request is recorded has having been
started, we also mark it as finished at the end.
Updates tailscale/corp#19767
Signed-off-by: Will Norris <will@tailscale.com>
Setting the field after-the-fact wasn't working because we could migrate
prefs on creation, which would set health status for auto updates.
Updates #11986
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I41d79ebd61d64829a3a9e70586ce56f62d24ccfd
While debugging a failing test in airplane mode on macOS, I noticed
netcheck logspam about ICMP socket creation permission denied errors.
Apparently macOS just can't do those, or at least not in airplane
mode. Not worth spamming about.
Updates #cleanup
Change-Id: I302620cfd3c8eabb25202d7eef040c01bd8a843c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The netcheck client, when no UDP is available, probes distance using
HTTPS.
Several problems:
* It probes using /derp/latency-check.
* But cmd/derper serves the handler at /derp/probe
* Despite the difference, it work by accident until c8f4dfc8c0
which made netcheck's probe require a 2xx status code.
* in tests, we only use derphttp.Handler, so the cmd/derper-installed
mux routes aren't preesnt, so there's no probe. That breaks
tests in airplane mode. netcheck.Client then reports "unexpected
HTTP status 426" (Upgrade Required)
This makes derp handle both /derp/probe and /derp/latency-check
equivalently, and in both cmd/derper and derphttp.Handler standalone
modes.
I notice this when wgengine/magicsock TestActiveDiscovery was failing
in airplane mode (no wifi). It still doesn't pass, but it gets
further.
Fixes#11989
Change-Id: I45213d4bd137e0f29aac8bd4a9ac92091065113f
Not buying wifi on a short flight is a good way to find tests
that require network. Whoops.
Updates #cleanup
Change-Id: Ibe678e9c755d27269ad7206413ffe9971f07d298
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In prep for it being required in more places.
Updates #11874
Change-Id: Ib743205fc2a6c6ff3d2c4ed3a2b28cac79156539
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This prevents Mark-of-the-Web bypass attacks in case someone visits the
localhost WebDAV server directly.
Fixestailscale/corp#19592
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This prevents Mark-of-the-Web bypass attacks in case someone visits the
localhost WebDAV server directly.
Fixestailscale/corp#19592
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This prevents Mark-of-the-Web bypass attacks in case someone visits the
localhost WebDAV server directly.
Fixestailscale/corp#19592
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This prevents Mark-of-the-Web bypass attacks in case someone visits the
localhost WebDAV server directly.
Fixestailscale/corp#19592
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This prevents Mark-of-the-Web bypass attacks in case someone visits the
localhost WebDAV server directly.
Fixestailscale/corp#19592
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This prevents Mark-of-the-Web bypass attacks in case someone visits the
localhost WebDAV server directly.
Fixestailscale/corp#19592
Signed-off-by: Percy Wegmann <percy@tailscale.com>
To aid in debugging exactly what's going wrong, instead of the
not-particularly-useful "dns udp query: context deadline exceeded" error
that we currently get.
Updates #3786
Updates #10768
Updates #11620
(etc.)
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I76334bf0681a8a2c72c90700f636c4174931432c
cmd/k8s-operator: optionally update dnsrecords Configmap with DNS records for proxies.
This commit adds functionality to automatically populate
DNS records for the in-cluster ts.net nameserver
to allow cluster workloads to resolve MagicDNS names
associated with operator's proxies.
The records are created as follows:
* For tailscale Ingress proxies there will be
a record mapping the MagicDNS name of the Ingress
device and each proxy Pod's IP address.
* For cluster egress proxies, configured via
tailscale.com/tailnet-fqdn annotation, there will be
a record for each proxy Pod, mapping
the MagicDNS name of the exposed
tailnet workload to the proxy Pod's IP.
No records will be created for any other proxy types.
Records will only be created if users have configured
the operator to deploy an in-cluster ts.net nameserver
by applying tailscale.com/v1alpha1.DNSConfig.
It is user's responsibility to add the ts.net nameserver
as a stub nameserver for ts.net DNS names.
https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/#configuration-of-stub-domain-and-upstream-nameserver-using-corednshttps://cloud.google.com/kubernetes-engine/docs/how-to/kube-dns#upstream_nameservers
See also https://github.com/tailscale/tailscale/pull/11017
Updates tailscale/tailscale#10499
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
It's deprecated and using it gets us the old slow behavior
according to https://go.dev/blog/randv2.
> Having eliminated repeatability of the global output stream, Go 1.20
> was also able to make the global generator scale better in programs
> that don’t call rand.Seed, replacing the Go 1 generator with a very
> cheap per-thread wyrand generator already used inside the Go
> runtime. This removed the global mutex and made the top-level
> functions scale much better. Programs that do call rand.Seed fall
> back to the mutex-protected Go 1 generator.
Updates #7123
Change-Id: Ia5452e66bd16b5457d4b1c290a59294545e13291
Signed-off-by: Maisem Ali <maisem@tailscale.com>
In prep for making health warnings rich objects with metadata rather
than a bunch of strings, start moving it all into the same place.
We'll still ultimately need the stringified form for the CLI and
LocalAPI for compatibility but we'll next convert all these warnings
into Warnables that have severity levels and such, and legacy
stringification will just be something each Warnable thing can do.
Updates #4136
Change-Id: I83e189435daae3664135ed53c98627c66e9e53da
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So that we can use this for additional, non-NAT configuration without it
being confusing.
Updates #cleanup
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1658d59c9824217917a94ee76d2d08f0a682986f
This was a holdover from the older, pre-BART days and is no longer
necessary.
Updates #cleanup
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I71b892bab1898077767b9ff51cef33d59c08faf8
Updates tailscale/corp#18960
Tests in corp called us using the wrong logging calls. Removed.
This is logged downstream anyway.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
This configures localClient correctly during flag parsing, so that the --socket
option is effective when generating tab-completion results. For example, the
following would not connect to the system Tailscale for tab-completion results:
tailscale --socket=/tmp/tailscaled.socket switch <TAB>
Updates #3793
Signed-off-by: Paul Scott <paul@tailscale.com>
Updates tailscale/corp#18960
iOS uses Apple's NetworkMonitor to track the default interface and
there's no reason we shouldn't also use this on macOS, for the same
reasons noted in the comments for why this change was made on iOS.
This eliminates the need to load and parse the routing table when
querying the defaultRouter() in almost all cases.
A slight modification here (on both platforms) to fallback to the default
BSD logic in the unhappy-path rather than making assumptions that
may not hold. If netmon is eventually parsing AF_ROUTE and able
to give a consistently correct answer for the default interface index,
we can fall back to that and eliminate the Swift dependency.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
cmd/k8s-operator/deploy/chart: allow users to configure additional labels for the operator's Pod via Helm chart values.
Fixes#11947
Signed-off-by: Gabe Gorelick <gabe@hightouch.io>
We had this in a different repo, but moving it here, as this a more
fitting package.
Updates #cleanup
Change-Id: I5fb9b10e465932aeef5841c67deba4d77d473d57
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Also, reset it in a few more places (e.g. logout, new blank profiles,
etc.) to avoid a few more cases where a pre-existing dialPlan can cause
a new Headscale server take 10+ seconds to connect.
Updates #11938
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I3095173a5a3d9720507afe4452548491e9e45a3e
If AtomicValue[T] is used with a T that is an interface kind,
then Store may panic if different concret types are ever stored.
Fix this by always wrapping in a concrete type.
Technically, this is only needed if T is an interface kind,
but there is no harm in doing it also for non-interface kinds.
Updates #cleanup
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
* cmd/k8s-nameserver,k8s-operator: add a nameserver that can resolve ts.net DNS names in cluster.
Adds a simple nameserver that can respond to A record queries for ts.net DNS names.
It can respond to queries from in-memory records, populated from a ConfigMap
mounted at /config. It dynamically updates its records as the ConfigMap
contents changes.
It will respond with NXDOMAIN to queries for any other record types
(AAAA to be implemented in the future).
It can respond to queries over UDP or TCP. It runs a miekg/dns
DNS server with a single registered handler for ts.net domain names.
Queries for other domain names will be refused.
The intended use of this is:
1) to allow non-tailnet cluster workloads to talk to HTTPS tailnet
services exposed via Tailscale operator egress over HTTPS
2) to allow non-tailnet cluster workloads to talk to workloads in
the same cluster that have been exposed to tailnet over their
MagicDNS names but on their cluster IPs.
DNSConfig CRD can be used to configure
the operator to deploy kube nameserver (./cmd/k8s-nameserver) to cluster.
Updates tailscale/tailscale#10499
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
When switching profile, the server URL can change (e.g.
because of switching to a self-hosted headscale instance).
If it is not reset here, dial plans returned by old
server (e.g. tailscale control server) will be used to
connect to new server (e.g. self-hosted headscale server),
and the register request will be blocked by it until
timeout, leading to very slow profile switches.
Updates #11938 11938
Signed-off-by: Shaw Drastin <showier.drastic0a@icloud.com>
Certain device drivers (e.g. vxlan, geneve) do not properly handle
coalesced UDP packets later in the stack, resulting in packet loss.
Updates #11026
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Add documentation for GET/PATCH/PUT `api/v2/tailnet/<ID>/dns/split-dns`.
These endpoints allow for reading, partially updating, and replacing the
split DNS settings for a given tailnet.
Updates https://github.com/tailscale/corp/issues/19483
Signed-off-by: Mario Minardi <mario@tailscale.com>
Before attempting to enable IPv6 forwarding in the proxy init container
check if the relevant module is found, else the container crashes
on hosts that don't have it.
Updates#11860
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The tailscale package is in the community Alpine repo. Check if it's
commented out in `/etc/apk/repositories` and run `setup-apkrepos -c -1`
if it's not.
Fixes#11263
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Add node attribute to determine whether or not to show suggested exit
node in UI.
Updates tailscale/corp#19515
Signed-off-by: Claire Wang <claire@tailscale.com>
This fixes bugs where after using the cli to set AdvertiseRoutes users
were finding that they had to restart tailscaled before the app
connector would advertise previously learned routes again. And seems
more in line with user expectations.
Fixes#11006
Signed-off-by: Fran Bull <fran@tailscale.com>
If the controlknob to persist app connector routes is enabled, when
reconfiguring an app connector unadvertise routes that are no longer
relevant.
Updates #11008
Signed-off-by: Fran Bull <fran@tailscale.com>
If the controlknob is on.
This will allow us to remove discovered routes associated with a
particular domain.
Updates #11008
Signed-off-by: Fran Bull <fran@tailscale.com>
When an app connector is reconfigured and domains to route are removed,
we would like to no longer advertise routes that were discovered for
those domains. In order to do this we plan to store which routes were
discovered for which domains.
Add a controlknob so that we can enable/disable the new behavior.
Updates #11008
Signed-off-by: Fran Bull <fran@tailscale.com>
Lays the groundwork for the ability to persist app connectors discovered
routes, which will allow us to stop advertising routes for a domain if
the app connector no longer monitors that domain.
Updates #11008
Signed-off-by: Fran Bull <fran@tailscale.com>
Explicitly set `-H "Content-Type: application/json"` in CURL examples
for POST endpoints as the default content type used by CURL is otherwise
`application/x-www-form-urlencoded` and these endpoints expect JSON data.
Updates https://github.com/tailscale/tailscale/issues/11914
Signed-off-by: Mario Minardi <mario@tailscale.com>
cmd/containerboot,kube,ipn/store/kubestore: allow interactive login and empty state Secrets, check perms
* Allow users to pre-create empty state Secrets
* Add a fake internal kube client, test functionality that has dependencies on kube client operations.
* Fix an issue where interactive login was not allowed in an edge case where state Secret does not exist
* Make the CheckSecretPermissions method report whether we have permissions to create/patch a Secret if it's determined that these operations will be needed
Updates tailscale/tailscale#11170
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
In prep for most of the package funcs in net/interfaces to become
methods in a long-lived netmon.Monitor that can cache things. (Many
of the funcs are very heavy to call regularly, whereas the long-lived
netmon.Monitor can subscribe to things from the OS and remember
answers to questions it's asked regularly later)
Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299
Change-Id: Ie4e8dedb70136af2d611b990b865a822cd1797e5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
... in prep for merging the net/interfaces package into net/netmon.
This is a no-op change that updates a bunch of the API signatures ahead of
a future change to actually move things (and remove the type alias)
Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299
Change-Id: I477613388f09389214db0d77ccf24a65bff2199c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Modifies containerboot to wait on tailscaled process
only, not on any child process of containerboot.
Waiting on any subprocess was racing with Go's
exec.Cmd.Run, used to run iptables commands and
that starts its own subprocesses and waits on them.
Containerboot itself does not run anything else
except for tailscaled, so there shouldn't be a need
to wait on anything else.
Updates tailscale/tailscale#11593
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The goal is to move more network state accessors to netmon.Monitor
where they can be cheaper/cached. But first (this change and others)
we need to make sure the one netmon.Monitor is plumbed everywhere.
Some notable bits:
* tsdial.NewDialer is added, taking a now-required netmon
* because a tsdial.Dialer always has a netmon, anything taking both
a Dialer and a NetMon is now redundant; take only the Dialer and
get the NetMon from that if/when needed.
* netmon.NewStatic is added, primarily for tests
Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299
Change-Id: I877f9cb87618c4eb037cee098241d18da9c01691
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This has been a TODO for ages. Time to do it.
The goal is to move more network state accessors to netmon.Monitor
where they can be cheaper/cached.
Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299
Change-Id: I60fc6508cd2d8d079260bda371fc08b6318bcaf1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I'm working on moving all network state queries to be on
netmon.Monitor, removing old APIs.
Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299
Change-Id: If0de137e0e2e145520f69e258597fb89cf39a2a3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Fixestailscale/corp#19558
A request for the suggested exit nodes that occurs too early in the
VPN lifecycle would result in a null deref of the netmap and/or
the netcheck report. This checks both and errors out.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Adds a new .spec.metrics field to ProxyClass to allow users to optionally serve
client metrics (tailscaled --debug) on <Pod-IP>:9001.
Metrics cannot currently be enabled for proxies that egress traffic to tailnet
and for Ingress proxies with tailscale.com/experimental-forward-cluster-traffic-via-ingress annotation
(because they currently forward all cluster traffic to their respective backends).
The assumption is that users will want to have these metrics enabled
continuously to be able to monitor proxy behaviour (as opposed to enabling
them temporarily for debugging). Hence we expose them on Pod IP to make it
easier to consume them i.e via Prometheus PodMonitor.
Updates tailscale/tailscale#11292
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This adds a health.Tracker to tsd.System, accessible via
a new tsd.System.HealthTracker method.
In the future, that new method will return a tsd.System-specific
HealthTracker, so multiple tsnet.Servers in the same process are
isolated. For now, though, it just always returns the temporary
health.Global value. That permits incremental plumbing over a number
of changes. When the second to last health.Global reference is gone,
then the tsd.System.HealthTracker implementation can return a private
Tracker.
The primary plumbing this does is adding it to LocalBackend and its
dozen and change health calls. A few misc other callers are also
plumbed. Subsequent changes will flesh out other parts of the tree
(magicsock, controlclient, etc).
Updates #11874
Updates #4136
Change-Id: Id51e73cfc8a39110425b6dc19d18b3975eac75ce
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In prep for tsd.System Tracker plumbing throughout tailscaled,
defensively permit all methods on Tracker to accept a nil receiver
without crashing, lest I screw something up later. (A health tracking
system that itself causes crashes would be no good.) Methods on nil
receivers should not be called, so a future change will also collect
their stacks (and panic during dev/test), but we should at least not
crash in prod.
This also locks that in with a test using reflect to automatically
call all methods on a nil receiver and check they don't crash.
Updates #11874
Updates #4136
Change-Id: I8e955046ebf370ec8af0c1fb63e5123e6282a9d3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously it was both metadata about the class of warnable item as
well as the value.
Now it's only metadata and the value is per-Tracker.
Updates #11874
Updates #4136
Change-Id: Ia1ed1b6c95d34bc5aae36cffdb04279e6ba77015
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This moves most of the health package global variables to a new
`health.Tracker` type.
But then rather than plumbing the Tracker in tsd.System everywhere,
this only goes halfway and makes one new global Tracker
(`health.Global`) that all the existing callers now use.
A future change will eliminate that global.
Updates #11874
Updates #4136
Change-Id: I6ee27e0b2e35f68cb38fecdb3b2dc4c3f2e09d68
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This improves convenience and security.
* Convenience - no need to see nodes that can't share anything with you.
* Security - malicious nodes can't expose shares to peers that aren't
allowed to access their shares.
Updates tailscale/corp#19432
Signed-off-by: Percy Wegmann <percy@tailscale.com>
If seamless key renewal is enabled, we typically do not stop the engine
(deconfigure networking). However, if the node key has expired there is
no point in keeping the connection up, and it might actually prevent
key renewal if auth relies on endpoints routed via app connectors.
Fixestailscale/corp#5800
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Fixestailscale/corp#19459
This PR adds the ability for users of the syspolicy handler to read string arrays from the MDM solution configured on the system.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
This change allows for the release/dist/qnap package to be used
outside of the tailscale repo (notably, will be used from corp),
by using an embedded file system for build files which gets
temporarily written to a new folder during qnap build runs.
Without this change, when used from corp, the release/dist/qnap
folder will fail to be found within the corp repo, causing
various steps of the build to fail.
The file renames in this change are to combine the build files
into a /files folder, separated into /scripts and /Tailscale.
Updates tailscale/tailscale-qpkg#135
Signed-off-by: Sonia Appasamy <sonia@tailscale.com>
This PR bumps iptables to a newer version that has a function to detect
'NotExists' errors and uses that function to determine whether errors
received on iptables rule and chain clean up are because the rule/chain
does not exist- if so don't log the error.
Updates corp#19336
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
* cmd/containerboot,util/linuxfw: support proxy backends specified by DNS name
Adds support for optionally configuring containerboot to proxy
traffic to backends configured by passing TS_EXPERIMENTAL_DEST_DNS_NAME env var
to containerboot.
Containerboot will periodically (every 10 minutes) attempt to resolve
the DNS name and ensure that all traffic sent to the node's
tailnet IP gets forwarded to the resolved backend IP addresses.
Currently:
- if the firewall mode is iptables, traffic will be load balanced
accross the backend IP addresses using round robin. There are
no health checks for whether the IPs are reachable.
- if the firewall mode is nftables traffic will only be forwarded
to the first IP address in the list. This is to be improved.
* cmd/k8s-operator: support ExternalName Services
Adds support for exposing endpoints, accessible from within
a cluster to the tailnet via DNS names using ExternalName Services.
This can be done by annotating the ExternalName Service with
tailscale.com/expose: "true" annotation.
The operator will deploy a proxy configured to route tailnet
traffic to the backend IPs that service.spec.externalName
resolves to. The backend IPs must be reachable from the operator's
namespace.
Updates tailscale/tailscale#10606
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Since the tailscaled binaries that we distribute are static and don't
link cgo, we previously wouldn't fetch group IDs that are returned via
NSS. Try shelling out to the 'id' command, similar to how we call
'getent', to detect such cases.
Updates #11682
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I9bdc938bd76c71bc130d44a97cc2233064d64799
There is an undocumented 16KiB limit for text log messages.
However, the limit for JSON messages is 256KiB.
Even worse, logging JSON as text results in significant overhead
since each double quote needs to be escaped.
Instead, use logger.Logf.JSON to explicitly log the info as JSON.
We also modify osdiag to return the information as structured data
rather than implicitly have the package log on our behalf.
This gives more control to the caller on how to log.
Updates #7802
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Creates new QNAP builder target, which builds go binaries then uses
docker to build into QNAP packages. Much of the docker/script code
here is pulled over from https://github.com/tailscale/tailscale-qpkg,
with adaptation into our builder structures.
The qnap/Tailscale folder contains static resources needed to build
Tailscale qpkg packages, and is an exact copy of the existing folder
in the tailscale-qpkg repo.
Builds can be run with:
```
sudo ./tool/go run ./cmd/dist build qnap
```
Updates tailscale/tailscale-qpkg#135
Signed-off-by: Sonia Appasamy <sonia@tailscale.com>
Since we already track active SSH connections, it's not hard to
proactively reject updates until those finish. We attempt to do the same
on the control side, but the detection latency for new connections is in
the minutes, which is not fast enough for common short sessions.
Handle a `force=true` query parameter to override this behavior, so that
control can still trigger an update on a server where some long-running
abandoned SSH session is open.
Updates https://github.com/tailscale/corp/issues/18556
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
It was only obviously unused after the previous change, c39cde79d.
Updates #19334
Change-Id: I9896d5fa692cb4346c070b4a339d0d12340c18f7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We were storing server-side lots of:
"Auth":{"Provider":"","LoginName":"","Oauth2Token":null,"AuthKey":""},
That was about 7% of our total storage of pending RegisterRequest
bodies.
Updates tailscale/corp#19327
Change-Id: Ib73842759a2b303ff5fe4c052a76baea0d68ae7d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If this happens, it results in us pessimistically closing more
connections than might be necessary, but is more correct since we won't
"miss" a change to the default route interface and keep trying to send
data over a nonexistent interface, or one that can't reach the internet.
Updates tailscale/corp#19124
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ia0b8b04cb8cdcb0da0155fd08751c9dccba62c1a
The Network Location Awareness service identifies networks authenticated against
an Active Directory domain and categorizes them as "Domain Authenticated".
This includes the Tailscale network if a Domain Controller is reachable through it.
If a network is categories as NLM_NETWORK_CATEGORY_DOMAIN_AUTHENTICATED,
it is not possible to override its category, and we shouldn't attempt to do so.
Additionally, our Windows Firewall rules should be compatible with both private
and domain networks.
This fixes both issues.
Fixes#11813
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Containers are typically immutable and should be updated as a whole (and
not individual packages within). Deny enablement of auto-updates in
containers.
Also, add the missing check in EditPrefs in LocalAPI, to catch cases
like tailnet default auto-updates getting enabled for nodes that don't
support it.
Updates #11544
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
We don't always have the same latest version for all platforms (like
with 1.64.2 is only Synology+Windows), so we should use the OS-specific
result from pkgs JSON response instead of the main Version field.
Updates #11795
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Kubernetes cluster domain defaults to 'cluster.local', but can also be customized.
We need to determine cluster domain to set up in-cluster forwarding to our egress proxies.
This was previously hardcoded to 'cluster.local', so was the egress proxies were not usable in clusters with custom domains.
This PR ensures that we attempt to determine the cluster domain by parsing /etc/resolv.conf.
In case the cluster domain cannot be determined from /etc/resolv.conf, we fall back to 'cluster.local'.
Updates tailscale/tailscale#10399,tailscale/tailscale#11445
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
peerapi does not want these, but rclone includes them.
Removing them allows rclone to work with Taildrive configured
as a WebDAV remote.
Updates #cleanup
Signed-off-by: Percy Wegmann <percy@tailscale.com>
peerapi does not want these, but rclone includes them.
Stripping them out allows rclone to work with Taildrive configured
as a WebDAV remote.
Updates #cleanup
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This ensures that MOVE, LOCK and any other verbs that use the Location
header work correctly.
Fixes#11758
Signed-off-by: Percy Wegmann <percy@tailscale.com>
-Move Android impl into interfaces_android.go
-Instead of using ip route to get the interface name, use the one passed in by Android (ip route is restricted in Android 13+ per termux/termux-app#2993)
Follow-up will be to do the same for router
Fixestailscale/corp#19215Fixestailscale/corp#19124
Signed-off-by: kari-ts <kari@tailscale.com>
Some editions of Windows server share the same build number as their
client counterpart; we must use an additional field found in the OS
version information to distinguish between them.
Even though "Distro" has Linux connotations, it is the most appropriate
hostinfo field. What is Windows Server if not an alternate distribution
of Windows? This PR populates Distro with "Server" when applicable.
Fixes#11785
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
This change is safe (self is still safe, by
definition), and makes the code match the comment.
Updates #cleanup
Signed-off-by: Chris Palmer <cpalmer@tailscale.com>
Most of the magicsock tests fake the network, simulating packets going
out and coming in. There's no reason to actually hit your router to do
UPnP/NAT-PMP/PCP during in tests. But while debugging thousands of
iterations of tests to deflake some things, I saw it slamming my
router. This stops that.
Updates #11762
Change-Id: I59b9f48f8f5aff1fa16b4935753d786342e87744
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Turns out, profileManager is not safe for concurrent use and I missed
all the locking infrastructure in LocalBackend, oops.
I was not able to reproduce the race even with `go test -count 100`, but
this seems like an obvious fix.
Fixes#11773
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This ensures that we close the underlying connection(s) when a major
link change happens. If we don't do this, on mobile platforms switching
between WiFi and cellular can result in leftover connections in the
http.Client's connection pool which are bound to the "wrong" interface.
Updates #10821
Updates tailscale/corp#19124
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibd51ce2efcaf4bd68e14f6fdeded61d4e99f9a01
The approach is lifted from cobra: `tailscale completion bash` emits a bash
script for configuring the shell's autocomplete:
. <( tailscale completion bash )
so that typing:
tailscale st<TAB>
invokes:
tailscale completion __complete -- st
RELNOTE=tailscale CLI now supports shell tab-completion
Fixes#3793
Signed-off-by: Paul Scott <paul@tailscale.com>
This removes AWS and Kubernetes support from Linux binaries by default
on GOARCH values where people don't typically run on AWS or use
Kubernetes, such as 32-bit mips CPUs.
It primarily focuses on optimizing for the static binaries we
distribute. But for people building it themselves, they can set
ts_kube or ts_aws (the opposite of ts_omit_kube or ts_omit_aws) to
force it back on.
Makes tailscaled binary ~2.3MB (~7%) smaller.
Updates #7272, #10627 etc
Change-Id: I42a8775119ce006fa321462cb2d28bc985d1c146
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This wasn't previously handling the case where an interface in s2 was
removed and not present in s1, and would cause the Equal method to
incorrectly return that the states were equal.
Updates tailscale/corp#19124
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I3af22bc631015d1ddd0a1d01bfdf312161b9532d
It should've been deleted in 11ece02f52.
Updates #9040
Change-Id: If8a136bdb6c82804af658c9d2b0a8c63ce02d509
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates tailscale/corp#18724
When localAPI clients directly set ExitNodeID to "", the expected behaviour is that the prior exit node also gets zero'd - effectively setting the UI state back to 'no exit node was ever selected'
The IntenalExitNodePrior has been changed to be a non-opaque type, as it is read by the UI to render the users last selected exit node, and must be concrete. Future-us can either break this, or deprecate it and replace it with something more interesting.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Seems to deflake tstest/integration tests. I can't reproduce it
anymore on one of my VMs that was consistently flaking after a dozen
runs before. Now I can run hundreds of times.
Updates #11649Fixes#7036
Change-Id: I2f7d4ae97500d507bdd78af9e92cd1242e8e44b8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We haven't needed this hack for quite some time Andrea says.
Updates #11649
Change-Id: Ie854b7edd0a01e92495669daa466c7c0d57e7438
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I'm on a mission to simplify LocalBackend.Start and its locking
and deflake some tests.
I noticed this hasn't been used since March 2023 when it was removed
from the Windows client in corp 66be796d33c.
So, delete.
Updates #11649
Change-Id: I40f2cb75fb3f43baf23558007655f65a8ec5e1b2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We have seen in macOS client logs that the "operation not permitted", a
syscall.EPERM error, is being returned when traffic is attempted to be
sent. This may be caused by security software on the client.
This change will perform a rebind and restun if we receive a
syscall.EPERM error on clients running darwin. Rebinds will only be
called if we haven't performed one specifically for an EPERM error in
the past 5 seconds.
Updates #11710
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
The gliderlabs/ssh license is actually already included in the standard
package listing. I'm not sure why I thought it wasn't.
Updates tailscale/corp#5780
This reverts commit 11dca08e93.
Signed-off-by: Will Norris <will@tailscale.com>
Trying to run iptables/nftables on Synology pauses for minutes with
lots of errors and ultimately does nothing as it's not used and we
lack permissions.
This fixes a regression from db760d0bac (#11601) that landed
between Synology testing on unstable 1.63.110 and 1.64.0 being cut.
Fixes#11737
Change-Id: Iaf9563363b8e45319a9b6fe94c8d5ffaecc9ccef
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Adds new ProxyClass.spec.statefulSet.pod.{tailscaleContainer,tailscaleInitContainer}.Env field
that allow users to provide key, value pairs that will be set as env vars for the respective containers.
Allow overriding all containerboot env vars,
but warn that this is not supported and might break (in docs + a warning when validating ProxyClass).
Updates tailscale/tailscale#10709
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
It was broken with the move to dist in 32e0ba5e68 which doesn't accept
amd64 anymore.
Updates #cleanup
Change-Id: Iaaaba2d73c6a09a226934fe8e5c18b16731ee7a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We have tstest/integration nowadays.
And this test was one of the lone holdouts using the to-be-nuked
SetControlClientGetterForTesting.
Updates #11649
Change-Id: Icf8a6a2e9b8ae1ac534754afa898c00dc0b7623b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
cc vs ccAuto is a mess. It needs to go. But this is a baby step towards
getting there.
Updates #11649
Change-Id: I34f33934844e580bd823a7d8f2b945cf26c87b3b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The new Android app and its libtailscale don't use this anymore;
it uses LocalAPI like other clients now.
Updates #11649
Change-Id: Ic9f42b41e0e0280b82294329093dc6c275f41d50
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
At least in userspace-networking mode.
Fixes#11361
Change-Id: I78d33f0f7e05fe9e9ee95b97c99b593f8fe498f2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Changes made:
* Avoid "encoding/json" for JSON processing, and instead use
"github.com/go-json-experiment/json/jsontext".
Use jsontext.Value.IsValid for validation, which is much faster.
Use jsontext.AppendQuote instead of our own JSON escaping.
* In drainPending, use a different maxLen depending on lowMem.
In lowMem mode, it is better to perform multiple uploads
than it is to construct a large body that OOMs the process.
* In drainPending, if an error is encountered draining,
construct an error message in the logtail JSON format
rather than something that is invalid JSON.
* In appendTextOrJSONLocked, use jsontext.Decoder to check
whether the input is a valid JSON object. This is faster than
the previous approach of unmarshaling into map[string]any and
then re-marshaling that data structure.
This is especially beneficial for network flow logging,
which produces relatively large JSON objects.
* In appendTextOrJSONLocked, enforce maxSize on the input.
If too large, then we may end up in a situation where the logs
can never be uploaded because it exceeds the maximum body size
that the Tailscale logs service accepts.
* Use "tailscale.com/util/truncate" to properly truncate a string
on valid UTF-8 boundaries.
* In general, remove unnecessary spaces in JSON output.
Performance:
name old time/op new time/op delta
WriteText 776ns ± 2% 596ns ± 1% -23.24% (p=0.000 n=10+10)
WriteJSON 110µs ± 0% 9µs ± 0% -91.77% (p=0.000 n=8+8)
name old alloc/op new alloc/op delta
WriteText 448B ± 0% 0B -100.00% (p=0.000 n=10+10)
WriteJSON 37.9kB ± 0% 0.0kB ± 0% -99.87% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
WriteText 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
WriteJSON 1.08k ± 0% 0.00k ± 0% -99.91% (p=0.000 n=10+10)
For text payloads, this is 1.30x faster.
For JSON payloads, this is 12.2x faster.
Updates #cleanup
Updates tailscale/corp#18514
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
We were being too aggressive when deciding whether to write our NRPT rules
to the local registry key or the group policy registry key.
After once again reviewing the document which calls itself a spec
(see issue), it is clear that the presence of the DnsPolicyConfig subkey
is the important part, not the presence of values set in the DNSClient
subkey. Furthermore, a footnote indicates that the presence of
DnsPolicyConfig in the GPO key will always override its counterpart in
the local key. The implication of this is important: we may unconditionally
write our NRPT rules to the local key. We copy our rules to the policy
key only when it contains NRPT rules belonging to somebody other than us.
Fixes https://github.com/tailscale/corp/issues/19071
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
This package is included in the tempfork directory, rather than as a go
module dependency, so is not included in the normal package list.
Updates tailscale/corp#5780
Signed-off-by: Will Norris <will@tailscale.com>
assert("a has fewer",routesWithout(prefixes("1.1.1.1/32","1.1.1.2/32"),prefixes("1.1.1.1/32","1.1.1.2/32","1.1.1.3/32","1.1.1.4/32")),[]netip.Prefix{})
assert("a has more",routesWithout(prefixes("1.1.1.1/32","1.1.1.2/32","1.1.1.3/32","1.1.1.4/32"),prefixes("1.1.1.1/32","1.1.1.3/32")),prefixes("1.1.1.2/32","1.1.1.4/32"))
panic("binary built with tailscale_go build tag but failed to read build info or find tailscale.toolchain.rev in build info")
}
want:=strings.TrimSpace(GoToolchainRev)
iftsRev!=want{
ifos.Getenv("TS_PERMIT_TOOLCHAIN_MISMATCH")=="1"{
fmt.Fprintf(os.Stderr,"tailscale.toolchain.rev = %q, want %q; but ignoring due to TS_PERMIT_TOOLCHAIN_MISMATCH=1\n",tsRev,want)
return
}
panic(fmt.Sprintf("binary built with tailscale_go build tag but Go toolchain %q doesn't match github.com/tailscale/tailscale expected value %q; override this failure with TS_PERMIT_TOOLCHAIN_MISMATCH=1",tsRev,want))
logf("The latest Tailscale release for Linux is %q, but your apk repository only provides %q.\nYour Alpine version is %q, you may need to upgrade the system to get the latest Tailscale version: https://wiki.alpinelinux.org/wiki/Upgrading_Alpine",latest,apkVer,alpineVer)
}
returnnil
}
func(up*Updater)updateMacSys()error{
returnerrors.New("NOTREACHED: On MacSys builds, `tailscale update` is handled in Swift to launch the GUI updater")
log.Printf("Warning: proxy backend ClusterIP is an IPv6 address and the proxy has a IPv4 tailnet address. You might need to disable IPv4 address allocation for the proxy for forwarding to work. See https://github.com/tailscale/tailscale/issues/12156")
}
if!local.IsValid(){
returnfmt.Errorf("no tailscale IP matching family of %s found in %v",dstStr,tsIPs)
returnfmt.Errorf("adding rule to clamp traffic via tailscale0: %v",err)
}
returnnil
}
iflen(v4Backends)!=0{
if!tsv4.IsValid(){
log.Printf("backend targets %v contain at least one IPv4 address, but this node's Tailscale IPs do not contain a valid IPv4 address: %v",backendAddrs,tsIPs)
log.Printf("backend targets %v contain at least one IPv6 address, but this node's Tailscale IPs do not contain a valid IPv6 address: %v",backendAddrs,tsIPs)
}elseif!nfr.HasIPV6NAT(){
log.Printf("backend targets %v contain at least one IPv6 address, but the chosen firewall mode does not support IPv6 NAT",backendAddrs)
log.Printf("backend addresses for egress target %q have changed old IPs %v, new IPs %v trigger egress config resync",nn.Name(),ips,nn.Addresses().AsSlice())
}
returntrue
}
}
}
returnfalse
}
// ensureServiceDeleted ensures that any rules for an egress service are removed
returnfmt.Errorf("error validating whether tailscaled config directory %q contains tailscaled config for current capability version %q: %w. If this is a Tailscale Kubernetes operator proxy, please ensure that the version of the operator is not older than the version of the proxy",dir,file,err)
returnerrors.New("TS_EXPERIMENTAL_VERSIONED_CONFIG_DIR cannot be set in combination with TS_HOSTNAME, TS_EXTRA_ARGS, TS_AUTHKEY, TS_ROUTES, TS_ACCEPT_DNS.")
// - Protocol & Go docs: https://pkg.go.dev/tailscale.com/derp
// - Running a DERP server: https://github.com/tailscale/tailscale/tree/main/cmd/derper#derp
packagemain// import "tailscale.com/cmd/derper"
import(
@@ -13,6 +19,7 @@ import (
"expvar"
"flag"
"fmt"
"html/template"
"io"
"log"
"math"
@@ -22,6 +29,9 @@ import (
"os/signal"
"path/filepath"
"regexp"
"runtime"
runtimemetrics"runtime/metrics"
"strconv"
"strings"
"syscall"
"time"
@@ -55,7 +65,7 @@ var (
meshPSKFile=flag.String("mesh-psk-file",defaultMeshPSKFile(),"if non-empty, path to file containing the mesh pre-shared key file. It should contain some hex string; whitespace is trimmed.")
meshWith=flag.String("mesh-with","","optional comma-separated list of hostnames to mesh with; the server's own hostname can be in the list")
bootstrapDNS=flag.String("bootstrap-dns-names","","optional comma-separated list of hostnames to make available at /bootstrap-dns")
unpublishedDNS=flag.String("unpublished-bootstrap-dns-names","","optional comma-separated list of hostnames to make available at /bootstrap-dns and not publish in the list")
unpublishedDNS=flag.String("unpublished-bootstrap-dns-names","","optional comma-separated list of hostnames to make available at /bootstrap-dns and not publish in the list. If an entry contains a slash, the second part names a DNS record to poll for its TXT record with a `0` to `100` value for rollout percentage.")
verifyClients=flag.Bool("verify-clients",false,"verify clients to this DERP server through a local tailscaled instance.")
verifyClientURL=flag.String("verify-client-url","","if non-empty, an admission controller URL for permitting client connections; see tailcfg.DERPAdmitClientRequest")
verifyFailOpen=flag.Bool("verify-client-url-fail-open",true,"whether we fail open if --verify-client-url is unreachable")
@@ -191,34 +201,35 @@ func main() {
http.Error(w,"derp server disabled",http.StatusNotFound)
}))
}
mux.HandleFunc("/derp/probe",probeHandler)
// These two endpoints are the same. Different versions of the clients
// have assumes different paths over time so we support both.
derpMapURL=flag.String("derp-map","https://login.tailscale.com/derpmap/default","URL to DERP map (https:// or file://)")
derpMapURL=flag.String("derp-map","https://login.tailscale.com/derpmap/default","URL to DERP map (https:// or file://) or 'local' to use the local tailscaled's DERP map")
versionFlag=flag.Bool("version",false,"print version and exit")
subcmd.FlagSet.BoolVar(&synologyPackageCenter,"synology-package-center",false,"build synology packages with extra metadata for the official package center")
subcmd.FlagSet.StringVar(&qnapPrivateKeyPath,"qnap-private-key-path","","sign qnap packages with given key (must also provide --qnap-certificate-path)")
subcmd.FlagSet.StringVar(&qnapCertificatePath,"qnap-certificate-path","","sign qnap packages with given certificate (must also provide --qnap-private-key-path)")
apiServer=rootFlagSet.String("api-server","api.tailscale.com","API server to contact")
failOnManualEdits=rootFlagSet.Bool("fail-on-manual-edits",false,"fail if manual edits to the ACLs in the admin panel are detected; when set to false (the default) only a warning is printed")
)
funcmodifiedExternallyError(){
funcmodifiedExternallyError()error{
if*githubSyntax{
fmt.Printf("::warning file=%s,line=1,col=1,title=Policy File Modified Externally::The policy file was modified externally in the admin console.\n",*policyFname)
returnfmt.Errorf("::warning file=%s,line=1,col=1,title=Policy File Modified Externally::The policy file was modified externally in the admin console.",*policyFname)
}else{
fmt.Printf("The policy file was modified externally in the admin console.\n")
returnfmt.Errorf("The policy file was modified externally in the admin console.")
# Repository defaults to DockerHub, but images are also synced to ghcr.io/tailscale/tailscale.
repository:tailscale/tailscale
# Digest will be prioritized over tag. If neither are set appVersion will be
# used.
tag:""
@@ -64,10 +78,14 @@ proxyConfig:
# Note that if you pass multiple tags to this field via `--set` flag to helm upgrade/install commands you must escape the comma (for example, "tag:k8s-proxies\,tag:prod"). See https://github.com/helm/helm/issues/1556
defaultTags:"tag:k8s"
firewallMode:auto
# If defined, this proxy class will be used as the default proxy class for
# service and ingress resources that do not have a proxy class defined. It
# does not apply to Connector resources.
defaultProxyClass:""
# apiServerProxyConfig allows to configure whether the operator should expose
description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info:https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
description:|-
APIVersion defines the versioned schema of this representation of an object.
Servers should convert recognized schemas to the latest internal value, and
may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
type:string
kind:
description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info:https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
description:|-
Kind is a string value representing the REST resource this object represents.
Servers may infer this from the endpoint the client submits requests to.
Cannot be updated.
In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
type:string
metadata:
type:object
spec:
description:ConnectorSpec describes the desired Tailscale component.
description:|-
ConnectorSpec describes the desired Tailscale component.
description:ExitNode defines whether the Connector node should act as a Tailscale exit node. Defaults to false. https://tailscale.com/kb/1103/exit-nodes
description:|-
ExitNode defines whether the Connector device should act as a Tailscale exit node. Defaults to false.
This field is mutually exclusive with the appConnector field.
https://tailscale.com/kb/1103/exit-nodes
type:boolean
hostname:
description:Hostname is the tailnet hostname that should be assigned to the Connector node. If unset, hostname defaults to <connector name>-connector. Hostname can contain lower case letters, numbers and dashes, it must not start or end with a dash and must be between 2 and 63 characters long.
description:|-
Hostname is the tailnet hostname that should be assigned to the
Connector node. If unset, hostname defaults to <connector
name>-connector. Hostname can contain lower case letters, numbers and
dashes, it must not start or end with a dash and must be between 2
and 63 characters long.
type:string
pattern:^[a-z0-9][a-z0-9-]{0,61}[a-z0-9]$
proxyClass:
description:ProxyClass is the name of the ProxyClass custom resource that contains configuration options that should be applied to the resources created for this Connector. If unset, the operator will create resources with the default configuration.
description:|-
ProxyClass is the name of the ProxyClass custom resource that
contains configuration options that should be applied to the
resources created for this Connector. If unset, the operator will
create resources with the default configuration.
type:string
subnetRouter:
description:SubnetRouter defines subnet routes that the Connector node should expose to tailnet. If unset, none are exposed. https://tailscale.com/kb/1019/subnets/
description:|-
SubnetRouter defines subnet routes that the Connector device should
expose to tailnet as a Tailscale subnet router.
https://tailscale.com/kb/1019/subnets/
If this field is unset, the device does not get configured as a Tailscale subnet router.
This field is mutually exclusive with the appConnector field.
type:object
required:
- advertiseRoutes
properties:
advertiseRoutes:
description:AdvertiseRoutes refer to CIDRs that the subnet router should make available. Route values must be strings that represent a valid IPv4 or IPv6 CIDR range. Values can be Tailscale 4via6 subnet routes. https://tailscale.com/kb/1201/4via6-subnets/
description:|-
AdvertiseRoutes refer to CIDRs that the subnet router should make
available. Route values must be strings that represent a valid IPv4
or IPv6 CIDR range. Values can be Tailscale 4via6 subnet routes.
https://tailscale.com/kb/1201/4via6-subnets/
type:array
minItems:1
items:
type:string
format:cidr
tags:
description:Tags that the Tailscale node will be tagged with. Defaults to [tag:k8s]. To autoapprove the subnet routes or exit node defined by a Connector, you can configure Tailscale ACLs to give these tags the necessary permissions. See https://tailscale.com/kb/1018/acls/#auto-approvers-for-routes-and-exit-nodes. If you specify custom tags here, you must also make the operator an owner of these tags. See https://tailscale.com/kb/1236/kubernetes-operator/#setting-up-the-kubernetes-operator. Tags cannot be changed once a Connector node has been created. Tag values must be in form ^tag:[a-zA-Z][a-zA-Z0-9-]*$.
description:|-
Tags that the Tailscale node will be tagged with.
Defaults to [tag:k8s].
To autoapprove the subnet routes or exit node defined by a Connector,
you can configure Tailscale ACLs to give these tags the necessary
permissions.
See https://tailscale.com/kb/1337/acl-syntax#autoapprovers.
If you specify custom tags here, you must also make the operator an owner of these tags.
See https://tailscale.com/kb/1236/kubernetes-operator/#setting-up-the-kubernetes-operator.
Tags cannot be changed once a Connector node has been created.
Tag values must be in form ^tag:[a-zA-Z][a-zA-Z0-9-]*$.
message:The appConnector field is mutually exclusive with exitNode and subnetRouter fields.
status:
description:ConnectorStatus describes the status of the Connector. This is set and managed by the Tailscale operator.
description:|-
ConnectorStatus describes the status of the Connector. This is set
and managed by the Tailscale operator.
type:object
properties:
conditions:
description:List of status conditions to indicate the status of the Connector. Known condition types are `ConnectorReady`.
description:|-
List of status conditions to indicate the status of the Connector.
Known condition types are `ConnectorReady`.
type:array
items:
description:ConnectorCondition contains condition information for a Connector.
description:Condition contains details for one aspect of the current state of this API Resource.
type:object
required:
- lastTransitionTime
- message
- reason
- status
- type
properties:
lastTransitionTime:
description:LastTransitionTime is the timestamp corresponding to the last status change of this condition.
description:|-
lastTransitionTime is the last time the condition transitioned from one status to another.
This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
type:string
format:date-time
message:
description:Message is a human readable description of the details of the last transition, complementing reason.
description:|-
message is a human readable message indicating details about the transition.
This may be an empty string.
type:string
maxLength:32768
observedGeneration:
description:If set, this represents the .metadata.generation that the condition was set based upon. For instance, if .metadata.generation is currently 12, but the .status.condition[x].observedGeneration is 9, the condition is out of date with respect to the current state of the Connector.
description:|-
observedGeneration represents the .metadata.generation that the condition was set based upon.
For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
with respect to the current state of the instance.
type:integer
format:int64
minimum:0
reason:
description:Reason is a brief machine readable explanation for the condition's last transition.
description:|-
reason contains a programmatic identifier indicating the reason for the condition's last transition.
Producers of specific condition types may define expected values and meanings for this field,
and whether the values are considered a guaranteed API.
The value should be a CamelCase string.
This field may not be empty.
type:string
maxLength:1024
minLength:1
pattern:^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
status:
description:Status of the condition, one of ('True', 'False', 'Unknown').
description:status of the condition, one of True, False, Unknown.
type:string
enum:
- "True"
- "False"
- Unknown
type:
description:Type of the condition, known values are (`SubnetRouterReady`).
description:type of condition in CamelCase or in foo.example.com/CamelCase.
l.Infof("proxy has configured egress service for tailnet target %v, current target is %v, waiting for proxy to reconfigure...",st.TailnetTarget,cfg.TailnetTarget)
returnfalse,nil
}
if!reflect.DeepEqual(cfg.Ports,st.Ports){
l.Debugf("proxy has configured egress service for ports %#+v, wants ports %#+v, waiting for proxy to reconfigure",st.Ports,cfg.Ports)
returnfalse,nil
}
l.Debugf("proxy is ready to route traffic to egress service")
// Currently we only ever set a single address on and Endpoint and nothing else is meant to modify this.
iflen(ep.Addresses)!=1{
returnfalse
}
returnstrings.EqualFold(ep.Addresses[0],podIP)&&
*ep.Conditions.Ready&&
*ep.Conditions.Serving&&
!*ep.Conditions.Terminating
}
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.