For instance, ephemeral nodes with only IPv6 addresses can now
SOCKS5-dial out to names like "foo" and resolve foo's IPv6 address
rather than foo's IPv4 address and get a "no route"
(*tcpip.ErrNoRoute) error from netstack's dialer.
Per https://github.com/tailscale/tailscale/issues/2268#issuecomment-870027626
which is only part of the isuse.
Updates #2268
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We also have to make a one-off change to /etc/wsl.conf to stop every
invocation of wsl.exe clobbering the /etc/resolv.conf. This appears to
be a safe change to make permanently, as even though the resolv.conf is
constantly clobbered, it is always the same stable internal IP that is
set as a nameserver. (I believe the resolv.conf clobbering predates the
MS stub resolver.)
Tested on WSL2, should work for WSL1 too.
Fixes#775
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
This is preliminary work for using the directManager as
part of a wslManager on windows, where in addition to configuring
windows we'll use wsl.exe to edit the linux file system and modify the
system resolv.conf.
The pinholeFS is a little funky, but it's designed to work through
simple unix tools via wsl.exe without invoking bash. I would not have
thought it would stand on its own like this, but it turns out it's
useful for writing a test for the directManager.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
After allowing for custom DERP maps, it's convenient to be able to see their latency in
netcheck. This adds a query to the local tailscaled for the current DERPMap.
Updates #1264
Signed-off-by: julianknodt <julianknodt@gmail.com>
Turns out we never reliably log the control plane URL a client connects
to. Do it here, and include the server public key, which might
inadvertently tell us something interesting some day.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
This is an experiment to see how often this test would fail if we run it
on every commit. This depends on #2145 to fix a flaky part of the test.
Signed-off-by: Christine Dodrill <xe@tailscale.com>
Okay, so, at a high level testing NixOS is a lot different than
other distros due to NixOS' determinism. Normally NixOS wants packages to
be defined in either an overlay, a custom packageOverrides or even
yolo-inline as a part of the system configuration. This is going to have
us take a different approach compared to other distributions. The overall
plan here is as following:
1. make the binaries as normal
2. template in their paths as raw strings to the nixos system module
3. run `nixos-generators -f qcow -o $CACHE_DIR/tailscale/nixos/version -c generated-config.nix`
4. pass that to the steps that make the virtual machine
It doesn't really make sense for us to use a premade virtual machine image
for this as that will make it harder to deterministically create the image.
Nix commands generate a lot of output, so their output is hidden behind the
`-verbose-nix-output` flag.
This unfortunately makes this test suite have a hard dependency on
Nix/NixOS, however the test suite has only ever been run on NixOS (and I
am not sure if it runs on other distros at all), so this probably isn't too
big of an issue.
Signed-off-by: Christine Dodrill <xe@tailscale.com>
This has been bothering me for a while, but everytime I run format from the root directory
it also formats this file. I didn't want to add it to my other PRs but it's annoying to have to
revert it every time.
Signed-off-by: julianknodt <julianknodt@gmail.com>
Move derpmap.Prod to a static JSON file (go:generate'd) instead,
to make its role explicit. And add a TODO about making dnsfallback
use an update-over-time DERP map file instead of a baked-in one.
Updates #1264
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously this test would reach out to the public DERP servers in order
to help machines connect with eachother. This is not ideal given our
plans to run these tests completely disconnected from the internet. This
patch introduces an in-process DERP server running on its own randomly
assigned HTTP port.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
Occasionally the test framework would fail with a timeout due to a
virtual machine not phoning home in time. This seems to be happen
whenever qemu can't bind the VNC or SSH ports for a virtual machine.
This was fixed by taking the following actions:
1. Don't listen on VNC unless the `-use-vnc` flag is passed, this
removes the need to listen on VNC at all in most cases. The option to
use VNC is still left in for debugging virtual machines, but removing
this makes it easier to deal with (VNC uses this odd system of
"displays" that are mapped to ports above 5900, and qemu doesn't
offer a decent way to use a normal port number, so we just disable
VNC by default as a compromise).
2. Use a (hopefully) inactive port for SSH. In an ideal world I'd just
have the VM's SSH port be exposed via a Unix socket, however the QEMU
documentation doesn't really say if you can do this or not. While I
do more research, this stopgap will have to make do.
3. Strictly tie more VM resource lifetimes to the tests themselves.
Previously the disk image layers for virtual machines were only
cleaned up at the end of the test and existed in the parent
test-scoped temporary folder. This can make your tmpfs run out of
space, which is not ideal. This should minimize the use of temporary
storage as much as I know how to.
4. Strictly tie the qemu process lifetime to the lifetime of the test
using testing.T#Cleanup. Previously it used a defer statement to
clean up the qemu process, however if the tests timed out this defer
was not run. This left around an orphaned qemu process that had to be
killed manually. This change ensures that all qemu processes exit
when their relevant tests finish.
Signed-off-by: Christine Dodrill <xe@tailscale.com>
Fix regression from 19c3e6cc9e
which made the locking coarser.
Found while debugging #2245, which ended up looking like a tswin/Windows
issue where Crawshaw had blocked cmd.exe's output.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This uses a debug envvar to optionally disable filter logging rate
limits by setting the environment variable
TS_DEBUG_FILTER_RATE_LIMIT_LOGS to "all", and if it matches,
the code will effectively disable the limits on the log rate by
setting the limit to 1 millisecond. This should make sure that all
filter logs will be captured.
Signed-off-by: Christine Dodrill <xe@tailscale.com>
This change (subject to some limitations) looks for the EDNS OPT record
in queries and responses, clamping the size field to fit within our DNS
receive buffer. If the size field is smaller than the DNS receive buffer
then it is left unchanged.
I think we will eventually need to transition to fully processing the
DNS queries to handle all situations, but this should cover the most
common case.
Mostly fixes#2066
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
This adds a flag to the DERP server which specifies to verify clients through a local
tailscaled. It is opt-in, so should not affect existing clients, and is mainly intended for
users who want to run their own DERP servers. It assumes there is a local tailscaled running and
will attempt to hit it for peer status information.
Updates #1264
Signed-off-by: julianknodt <julianknodt@gmail.com>
Unused so far, but eventually we'll want this for SOCKS5 UDP binds (we
currently only do TCP with SOCKS5), and also for #2102 for forwarding
MagicDNS upstream to Tailscale IPs over netstack.
Updates #2102
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Windows 8.1 incorrectly handles search paths on an interface with no
associated resolver, so we have to provide a full primary DNS config
rather than use Windows 8.1's nascent-but-present NRPT functionality.
Fixes#2237.
Signed-off-by: David Anderson <danderson@tailscale.com>
This adds a flag to derp maps which specifies that default Tailscale DERP servers should not be
used. If true and there are entries in this map, it indicates that the entries in this map
should take precedent and not hit any of tailscale's DERP servers.
This change is backwards compatible, as the default behavior should be false.
Updates #1264
Signed-off-by: julianknodt <julianknodt@gmail.com>
In order to clone DERPMaps, it was necessary to extend the cloner so that it supports
nested pointers inside of maps which are also cloneable. This also adds cloning for DERPRegions
and DERPNodes because they are on DERPMap's maps.
Signed-off-by: julianknodt <julianknodt@gmail.com>
The only connectivity an AWS Lambda container has is an IPv4 link-local
169.254.x.x address using NAT:
12: vtarget_1@if11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
qdisc noqueue state UP group default qlen 1000
link/ether 7e:1c:3f:00:00:00 brd ff:ff:ff:ff:ff:ff link-netnsid 1
inet 169.254.79.1/32 scope global vtarget_1
valid_lft forever preferred_lft forever
If there are no other IPv4/v6 addresses available, and we are running
in AWS Lambda, allow IPv4 169.254.x.x addresses to be used.
----
Similarly, a Google Cloud Run container's only connectivity is
a Unique Local Address fddf:3978:feb1:d745::c001/128.
If there are no other addresses available then allow IPv6
Unique Local Addresses to be used.
We actually did this in an earlier release, but now refactor it to
work the same way as the IPv4 link-local support is being done.
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
Before it was using the local address and port, so fix that.
The fields in the response from `ss` are:
State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port, Process
Signed-off-by: julianknodt <julianknodt@gmail.com>
This adds a handler on the DERP server for logging bytes send and received by clients of the
server, by holding open a connection and recording if there is a difference between the number
of bytes sent and received. It sends a JSON marshalled object if there is an increase in the
number of bytes.
Signed-off-by: julianknodt <julianknodt@gmail.com>
Split out of Denton's #2164, to make that diff smaller to review.
This change has no behavior changes.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously we used t.Logf indirectly via package log. This worked, but
it was not ideal for our needs. It could cause the streams of output to
get crossed. This change uses a logger.FuncWriter every place log.Output
was previously used, which will more correctly write log information to
the right test output stream.
Signed-off-by: Christine Dodrill <xe@tailscale.com>
It's possible to install a configuration that passes our current checks
for systemd-resolved, without actually pointing to systemd-resolved. In
that case, we end up programming DNS in resolved, but that config never
applies to any name resolution requests on the system.
This is quite a far-out edge case, but there's a simple additional check
we can do: if the header comment names systemd-resolved, there should be
a single nameserver in resolv.conf pointing to 127.0.0.53. If not, the
configuration should be treated as an unmanaged resolv.conf.
Fixes#2136.
Signed-off-by: David Anderson <danderson@tailscale.com>
The dependency is a "soft" ordering dependency only, meaning that
tailscaled will start after those services if those services were
going to be run anyway, but doesn't force either of them to run.
That's why it's safe to specify this dependency unconditionally,
even for systems that don't run those services.
Updates #2127.
Signed-off-by: David Anderson <danderson@tailscale.com>
AWS Lambda uses Docker containers but does not
have the string "docker" in its /proc/1/cgroup.
Infer AWS Lambda via the environment variables
it sets.
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
It would be useful to know the time that packets spend inside of a queue before they are sent
off, as that can be indicative of the load the server is handling (and there was also an
existing TODO). This adds a simple exponential moving average metric to track the average packet
queue duration.
Changes during review:
Add CAS loop for recording queue timing w/ expvar.Func, rm snake_case, annotate in milliseconds,
convert
Signed-off-by: julianknodt <julianknodt@gmail.com>
Alpine Linux[1] is a minimal Linux distribution built around musl libc.
It boots very quickly, requires very little ram and is as close as you
can get to an ideal citizen for testing Tailscale on musl. Alpine has a
Tailscale package already[2], but this patch also makes it easier for us
to provide an Alpine Linux package off of pkgs in the future.
Alpine only offers Tailscale on the rolling-release edge branch.
[1]: https://alpinelinux.org/
[2]: https://pkgs.alpinelinux.org/packages?name=tailscale&branch=edge
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
This fails pretty reliably with a lot of output now showing what's
happening:
TS_DEBUG_MAP=1 go test --failfast -v -run=Ping -race -count=20 ./tstest/integration --verbose-tailscaled
I haven't dug into the details yet, though.
Updates #2079
The route creation for the `tun` device was augmented in #1469 but
didn't account for adding IPv4 vs. IPv6 routes. There are 2 primary
changes as a result:
* Ensure that either `-inet` or `-inet6` was used in the
[`route(8)`](https://man.openbsd.org/route) command
* Use either the `localAddr4` or `localAddr6` for the gateway argument
depending which destination network is being added
The basis for the approach is based on the implementation from
`router_userspace_bsd.go`, including the `inet()` helper function.
Fixes#2048
References #1469
Signed-off-by: Fletcher Nichol <fnichol@nichol.ca>
This raises the maximum DNS response message size from 512 to 4095. This
should be large enough for almost all situations that do not need TCP.
We still do not recognize EDNS, so we will still forward requests that
claim support for a larger response size than 4095 (that will be solved
later). For now, when a response comes back that is too large to fit in
our receive buffer, we now set the truncation flag in the DNS header,
which is an improvement from before but will prompt attempts to use TCP
which isn't supported yet.
On Windows, WSARecvFrom into a buffer that's too small returns an error
in addition to the data. On other OSes, the extra data is silently
discarded. In this case, we prefer the latter so need to catch the error
on Windows.
Partially addresses #1123
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
This runner is in my homelab while we muse about a better, more
permanent home for these tests.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
This makes integration tests pull pristine VM images from Amazon S3 if
they don't exist on disk. If the S3 fetch fails, it will fall back to
grabbing the image from the public internet. The VM images on the public
internet are known to be updated without warning and thusly change their
SHA256 checksum. This is not ideal for a test that we want to be able to
fire and forget, then run reliably for a very long time.
This requires an AWS profile to be configured at the default path. The
S3 bucket is rigged so that the requester pays. The VM images are
currently about 6.9 gigabytes. Please keep this in mind when running
these tests on your machine.
Documentation was added to the integration test folder to aid others in
running these tests on their machine.
Some wording in the logs of the tests was altered.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
Some downstream distros eval'd version/version.sh to get at the shell variables
within their own build process. They can now `./build_dist.sh shellvars` to get
those.
Fixes#2058.
Signed-off-by: David Anderson <dave@natulte.net>
It is a bit faster.
But more importantly, it matches upstream byte-for-byte,
which ensures there'll be no corner cases in which we disagree.
name old time/op new time/op delta
SetPeers-8 3.58µs ± 0% 3.16µs ± 2% -11.74% (p=0.016 n=4+5)
name old alloc/op new alloc/op delta
SetPeers-8 2.53kB ± 0% 2.53kB ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
SetPeers-8 99.0 ± 0% 99.0 ± 0% ~ (all equal)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
The image downloads can take a significant amount of time for the tests.
This creates a new test that will download every distro image into the
local cache in parallel, optionally matching the distribution regex.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
I've run into a couple issues where the tests time out while a VM image
is being downloaded, making the cache poisoned for the next run. This
moves the hash checking into its own function and calls it much sooner
in the testing chain. If the hash check fails, the OS is redownloaded.
Signed-off-by: Christine Dodrill <xe@tailscale.com>
Most of the time qemu will output nothing when it is running. This is
expected behavior. However when qemu is unable to start due to some
problem, it prints that to either stdout or stderr. Previously this
output wasn't being captured. This patch captures that output to aid in
debugging qemu issues.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
We were crashing on in initPeerAPIListener when called from
authReconfig when b.netMap is nil. But authReconfig already returns
before the call to initPeerAPIListener when b.netMap is nil, but it
releases the b.mu mutex before calling initPeerAPIListener which
reacquires it and assumes it's still nil.
The only thing that can be setting it to nil is setNetMapLocked, which
is called by ResetForClientDisconnect, Logout/logout, or Start, all of
which can happen during an authReconfig.
So be more defensive.
Fixes#1996
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We used to use "redo" for that, but it was pretty vague.
Also, fix the build tags broken in interfaces_default_route_test.go from
a9745a0b68, moving those Linux-specific
tests to interfaces_linux_test.go.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously this built the binaries for every distro. This is a bit
overkill given we are using static binaries. This patch makes us only
build once.
There was also a weird issue with how processes were being managed.
Previously we just killed qemu with Process.Kill(), however that was
leaving behind zombies. This has been mended to not only kill qemu but
also waitpid() the process so it doesn't become a zombie.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
The OpenSUSE 15.1 image we are using (and conseqentially the only one
that is really available easily given it is EOL) has cloud-init
hardcoded to use the OpenStack metadata thingy. Other OpenSUSE Leap
images function fine with the NoCloud backend, but this one seems to
just not work with it. No bother, we can just pretend to be OpenStack.
Thanks to Okami for giving me an example OpenStack configuration seed
image.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
Arch is a bit of a weirder distro, however as a side effect it is much
more of a systemd purist experience. Adding it to our test suite will
make sure that we are working in the systemd happy path.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
This distro is about to be released. OpenSUSE has historically had the
least coverage for functional testing, so this may prove useful in the
future.
Signed-off-by: Christine Dodrill <xe@tailscale.com>
DNS names consist of labels, but outside of length limits, DNS
itself permits any content within the labels. Some records require
labels to conform to hostname limitations (which is what we implemented
before), but not all.
Fixes#2024.
Signed-off-by: David Anderson <danderson@tailscale.com>
Instead of testing all the VMs at once when they are all ready, this
patch changes the testing logic so that the vms are tested as soon as
they register with testcontrol. Also limit the amount of VM ram used at
once with the `-ram-limit` flag. That uses a semaphore to guard resource
use.
Also document CentOS' sins.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
The resulting empty Prefs had AllowSingleHosts=false and
Routeall=false, so that on iOS if you did these steps:
- Login and leave running
- Terminate the frontend
- Restart the frontend (fast path restart, missing prefs)
- Set WantRunning=false
- Set WantRunning=true
...then you would have Tailscale running, but with no routes. You would
also accidentally disable the ExitNodeID/IP prefs (symptom: the current
exit node setting didn't appear in the UI), but since nothing
else worked either, you probably didn't notice.
The fix was easy enough. It turns out we already knew about the
problem, so this also fixes one of the BUG entries in state_test.
Fixes: #1918 (BUG-1) and some as-yet-unreported bugs with exit nodes.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
Previously, there was no server round trip required to log out, so when
you asked ipnlocal to Logout(), it could clear the netmap immediately
and switch to NeedsLogin state.
In v1.8, we added a true Logout operation. ipn.Logout() would trigger
an async cc.StartLogout() and *also* immediately switch to NeedsLogin.
Unfortunately, some frontends would see NeedsLogin and immediately
trigger a new StartInteractiveLogin() operation, before the
controlclient auth state machine actually acted on the Logout command,
thus accidentally invalidating the entire logout operation, retaining
the netmap, and violating the user's expectations.
Instead, add a new LogoutFinished signal from controlclient
(paralleling LoginFinished) and, upon starting a logout, don't update
the ipn state machine until it's received.
Updates: #1918 (BUG-2)
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
If you set `-distro-regex` to match a subset of distros, only those
distros will be tested. Ex:
$ go test -run-vm-tests -distro-regex='opensuse'
Signed-off-by: Christine Dodrill <xe@tailscale.com>
Don't try to do heuristics on the name. Use the net/interfaces package
which we already have to do this sort of stuff.
Fixes#2011
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Instead of pulling packages from pkgs.tailscale.com, we should use the
tailscale binaries that are local to this git commit. This exposes a bit
of the integration testing stack in order to copy the binaries
correctly.
This commit also bumps our version of github.com/pkg/sftp to the latest
commit.
If you run into trouble with yaml, be sure to check out the
commented-out alpine linux image complete with instructions on how to
use it.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
netaddr allocated at the time this was written. No longer.
name old time/op new time/op delta
TailscaleServiceAddr-4 5.46ns ± 4% 1.83ns ± 3% -66.52% (p=0.008 n=5+5)
A bunch of the others can probably be simplified too, but this
was the only one with just an IP and not an IPPrefix.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously we spewed a lot of output to stdout and stderr, even when
`-v` wasn't set. This is sub-optimal for various reasons. This patch
shunts that output to test logs so it only shows up when `-v` is set.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
The cyolosecurity fork of certstore did not update its module name and
thus can only be used with a replace directive. This interferes with
installing using `go install` so I created a tailscale fork with an
updated module name.
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
Instead of relying on a libvirtd bridge address that you probably won't
have on your system.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
On clean installs we didn't set use iptables, but during upgrades it
looks like we could use old prefs that directed us to go into the iptables
paths that might fail on Synology.
Updates #1995Fixestailscale/tailscale-synology#57 (I think)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This will spin up a few vms and then try and make them connect to a
testcontrol server.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
When tailscaled starts up, these lines run:
func run() error {
// ...
pol := logpolicy.New("tailnode.log.tailscale.io")
pol.SetVerbosityLevel(args.verbose)
// ...
}
If there are old log entries present, they immediate start getting uploaded. This races with the call to pol.SetVerbosityLevel.
This manifested itself as a test failure in tailscale.com/tstest/integration
when run with -race:
WARNING: DATA RACE
Read at 0x00c0001bc970 by goroutine 24:
tailscale.com/logtail.(*Logger).Write()
/Users/josh/t/corp/oss/logtail/logtail.go:517 +0x27c
log.(*Logger).Output()
/Users/josh/go/ts/src/log/log.go:184 +0x2b8
log.Printf()
/Users/josh/go/ts/src/log/log.go:323 +0x94
tailscale.com/logpolicy.newLogtailTransport.func1()
/Users/josh/t/corp/oss/logpolicy/logpolicy.go:509 +0x36c
net/http.(*Transport).dial()
/Users/josh/go/ts/src/net/http/transport.go:1168 +0x238
net/http.(*Transport).dialConn()
/Users/josh/go/ts/src/net/http/transport.go:1606 +0x21d0
net/http.(*Transport).dialConnFor()
/Users/josh/go/ts/src/net/http/transport.go:1448 +0xe4
Previous write at 0x00c0001bc970 by main goroutine:
tailscale.com/logtail.(*Logger).SetVerbosityLevel()
/Users/josh/t/corp/oss/logtail/logtail.go:131 +0x98
tailscale.com/logpolicy.(*Policy).SetVerbosityLevel()
/Users/josh/t/corp/oss/logpolicy/logpolicy.go:463 +0x60
main.run()
/Users/josh/t/corp/oss/cmd/tailscaled/tailscaled.go:178 +0x50
main.main()
/Users/josh/t/corp/oss/cmd/tailscaled/tailscaled.go:163 +0x71c
Goroutine 24 (running) created at:
net/http.(*Transport).queueForDial()
/Users/josh/go/ts/src/net/http/transport.go:1417 +0x4d8
net/http.(*Transport).getConn()
/Users/josh/go/ts/src/net/http/transport.go:1371 +0x5b8
net/http.(*Transport).roundTrip()
/Users/josh/go/ts/src/net/http/transport.go:585 +0x7f4
net/http.(*Transport).RoundTrip()
/Users/josh/go/ts/src/net/http/roundtrip.go:17 +0x30
net/http.send()
/Users/josh/go/ts/src/net/http/client.go:251 +0x4f0
net/http.(*Client).send()
/Users/josh/go/ts/src/net/http/client.go:175 +0x148
net/http.(*Client).do()
/Users/josh/go/ts/src/net/http/client.go:717 +0x1d0
net/http.(*Client).Do()
/Users/josh/go/ts/src/net/http/client.go:585 +0x358
tailscale.com/logtail.(*Logger).upload()
/Users/josh/t/corp/oss/logtail/logtail.go:367 +0x334
tailscale.com/logtail.(*Logger).uploading()
/Users/josh/t/corp/oss/logtail/logtail.go:289 +0xec
Rather than complicate the logpolicy API,
allow the verbosity to be adjusted concurrently.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Pull in the latest version of wireguard-windows.
Switch to upstream wireguard-go.
This requires reverting all of our import paths.
Unfortunately, this has to happen at the same time.
The wireguard-go change is very low risk,
as that commit matches our fork almost exactly.
(The only changes are import paths, CI files, and a go.mod entry.)
So if there are issues as a result of this commit,
the first place to look is wireguard-windows changes.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
We repeat many peers each time we call SetPeers.
Instead of constructing strings for them from scratch every time,
keep strings alive across iterations.
name old time/op new time/op delta
SetPeers-8 3.58µs ± 1% 2.41µs ± 1% -32.60% (p=0.000 n=9+10)
name old alloc/op new alloc/op delta
SetPeers-8 2.53kB ± 0% 1.30kB ± 0% -48.73% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
SetPeers-8 99.0 ± 0% 16.0 ± 0% -83.84% (p=0.000 n=10+10)
We could reduce alloc/op 12% and allocs/op 23% if strs had
type map[string]strCache instead of map[string]*strCache,
but that wipes out the execution time impact.
Given that re-use is the most common scenario, let's optimize for it.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
e66d4e4c81 added AppendTo methods
to some key types. Their marshaled form is longer than 64 bytes.
name old time/op new time/op delta
Hash-8 15.5µs ± 1% 14.8µs ± 1% -4.17% (p=0.000 n=9+9)
name old alloc/op new alloc/op delta
Hash-8 1.18kB ± 0% 0.47kB ± 0% -59.87% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
Hash-8 12.0 ± 0% 6.0 ± 0% -50.00% (p=0.000 n=10+10)
This is still a bit worse than explicitly handling the types,
but much nicer.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
All netaddr types that we are concerned with now implement AppendTo.
Use the AppendTo method if available, and remove all references to netaddr.
This is slower but cleaner, and more readily re-usable by others.
name old time/op new time/op delta
Hash-8 12.6µs ± 0% 14.8µs ± 1% +18.05% (p=0.000 n=8+10)
HashMapAcyclic-8 21.4µs ± 1% 21.9µs ± 1% +2.39% (p=0.000 n=10+9)
name old alloc/op new alloc/op delta
Hash-8 408B ± 0% 408B ± 0% ~ (p=1.000 n=10+10)
HashMapAcyclic-8 1.00B ± 0% 1.00B ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
Hash-8 6.00 ± 0% 6.00 ± 0% ~ (all equal)
HashMapAcyclic-8 0.00 0.00 ~ (all equal)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
These exist so we can use the optimized MapIter APIs
while still working with released versions of Go.
They're pretty simple, but some docs won't hurt.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Reduce to just a single external endpoint.
Convert from a variadic number of interfaces to a slice there.
name old time/op new time/op delta
Hash-8 14.4µs ± 0% 14.0µs ± 1% -3.08% (p=0.000 n=9+9)
name old alloc/op new alloc/op delta
Hash-8 873B ± 0% 793B ± 0% -9.16% (p=0.000 n=9+6)
name old allocs/op new allocs/op delta
Hash-8 18.0 ± 0% 14.0 ± 0% -22.22% (p=0.000 n=10+10)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Slightly slower, but lots less garbage.
We will recover the speed lost in a follow-up commit.
name old time/op new time/op delta
Hash-8 13.5µs ± 1% 14.3µs ± 0% +5.84% (p=0.000 n=10+9)
name old alloc/op new alloc/op delta
Hash-8 1.46kB ± 0% 0.87kB ± 0% -40.10% (p=0.000 n=7+10)
name old allocs/op new allocs/op delta
Hash-8 43.0 ± 0% 18.0 ± 0% -58.14% (p=0.000 n=10+10)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
This requires changes to the Go toolchain.
The changes are upstream at https://golang.org/cl/320929.
They haven't been pulled into our fork yet.
No need to allocate new iteration scratch values for every map.
name old time/op new time/op delta
Hash-8 13.6µs ± 0% 13.5µs ± 0% -1.01% (p=0.008 n=5+5)
HashMapAcyclic-8 21.2µs ± 1% 21.1µs ± 2% ~ (p=0.310 n=5+5)
name old alloc/op new alloc/op delta
Hash-8 1.58kB ± 0% 1.46kB ± 0% -7.60% (p=0.008 n=5+5)
HashMapAcyclic-8 152B ± 0% 128B ± 0% -15.79% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
Hash-8 49.0 ± 0% 43.0 ± 0% -12.24% (p=0.008 n=5+5)
HashMapAcyclic-8 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.008 n=5+5)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
To get the benefit of this optimization requires help from the Go toolchain.
The changes are upstream at https://golang.org/cl/320929,
and have been pulled into the Tailscale fork at
728ecc58fd.
It also requires building with the build tag tailscale_go.
name old time/op new time/op delta
Hash-8 14.0µs ± 0% 13.6µs ± 0% -2.88% (p=0.008 n=5+5)
HashMapAcyclic-8 24.3µs ± 1% 21.2µs ± 1% -12.47% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
Hash-8 2.16kB ± 0% 1.58kB ± 0% -27.01% (p=0.008 n=5+5)
HashMapAcyclic-8 2.53kB ± 0% 0.15kB ± 0% -93.99% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
Hash-8 77.0 ± 0% 49.0 ± 0% -36.36% (p=0.008 n=5+5)
HashMapAcyclic-8 202 ± 0% 4 ± 0% -98.02% (p=0.008 n=5+5)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
setkey
The acyclic map code interacts badly with netaddr.IPs.
One of the netaddr.IP fields is an *intern.Value,
and we use a few sentinel values.
Those sentinel values make many of the netaddr data structures appear cyclic.
One option would be to replace the cycle-detection code with
a Floyd-Warshall style algorithm. The downside is that this will take
longer to detect cycles, particularly if the cycle is long.
This problem is exacerbated by the fact that the acyclic cycle detection
code shares a single visited map for the entire data structure,
not just the subsection of the data structure localized to the map.
Unfortunately, the extra allocations and work (and code) to use per-map
visited maps make this option not viable.
Instead, continue to special-case netaddr data types.
name old time/op new time/op delta
Hash-8 22.4µs ± 0% 14.0µs ± 0% -37.59% (p=0.008 n=5+5)
HashMapAcyclic-8 23.8µs ± 0% 24.3µs ± 1% +1.75% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
Hash-8 2.49kB ± 0% 2.16kB ± 0% ~ (p=0.079 n=4+5)
HashMapAcyclic-8 2.53kB ± 0% 2.53kB ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
Hash-8 86.0 ± 0% 77.0 ± 0% -10.47% (p=0.008 n=5+5)
HashMapAcyclic-8 202 ± 0% 202 ± 0% ~ (all equal)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Hash and xor each entry instead, then write final xor'ed result.
name old time/op new time/op delta
Hash-4 33.6µs ± 4% 34.6µs ± 3% +3.03% (p=0.013 n=10+9)
name old alloc/op new alloc/op delta
Hash-4 1.86kB ± 0% 1.77kB ± 0% -5.10% (p=0.000 n=10+9)
name old allocs/op new allocs/op delta
Hash-4 51.0 ± 0% 49.0 ± 0% -3.92% (p=0.000 n=10+10)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
At the start of a dev cycle we'll upgrade all dependencies.
Done with:
$ for Dep in $(cat go.mod | perl -ne '/(\S+) v/ and print "$1\n"'); do go get $Dep@upgrade; done
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Our wireguard-go fork used different values from upstream for
package device's memory limits on iOS.
This was the last blocker to removing our fork.
These values are now vars rather than consts for iOS.
c27ff9b9f6
Adjust them on startup to our preferred values.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Typical maps in production are considerably longer.
This helps benchmarks more accurately reflect the costs per key
vs the costs per map in deephash.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
A couple of code paths in ipnserver use a NewBackendServer with a nil
backend just to call the callback with an encapsulated error message.
This covers a panic case seen in logs.
For #1920
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
This leads to a cleaner separation of intent vs. implementation
(Routes is now the only place specifying who handles DNS requests),
and allows for cleaner expression of a configuration that creates
MagicDNS records without serving them to the OS.
Signed-off-by: David Anderson <danderson@tailscale.com>
* Added new Addresses / AllowedIPs fields to testcontrol when creating new &tailcfg.Node
Signed-off-by: Simeng He <simeng@tailscale.com>
* Added single node test to check Addresses and AllowedIPs
Signed-off-by: Simeng He <simeng@tailscale.com>
Co-authored-by: Simeng He <simeng@tailscale.com>
The script detects one of the supported OS/version combos, and issues
the right install instructions for it.
Co-authored-by: Christine Dodrill <xe@tailscale.com>
Signed-off-by: David Anderson <danderson@tailscale.com>
If --until-direct is set, the goal is to make a direct connection.
If we failed at that, say so, and exit with an error.
RELNOTE=tailscale ping --until-direct (the default) now exits with
a non-zero exit code if no direct connection was established.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
This code path is very tricky since it was originally designed for the
"re-authenticate to refresh my keys" use case, which didn't want to
lose the original session even if the refresh cycle failed. This is why
it acts differently from the Logout(); Login(); case.
Maybe that's too fancy, considering that it probably never quite worked
at all, for switching between users without logging out first. But it
works now.
This was more invasive than I hoped, but the necessary fixes actually
removed several other suspicious BUG: lines from state_test.go, so I'm
pretty confident this is a significant net improvement.
Fixestailscale/corp#1756.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
If the engine was shutting down from a previous session
(e.closing=true), it would return an error code when trying to get
status. In that case, ipnlocal would never unblock any callers that
were waiting on the status.
Not sure if this ever happened in real life, but I accidentally
triggered it while writing a test.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
magicsock.Conn.ParseEndpoint requires a peer's public key,
disco key, and legacy ip/ports in order to do its job.
We currently accomplish that by:
* adding the public key in our wireguard-go fork
* encoding the disco key as magic hostname
* using a bespoke comma-separated encoding
It's a bit messy.
Instead, switch to something simpler: use a json-encoded struct
containing exactly the information we need, in the form we use it.
Our wireguard-go fork still adds the public key to the
address when it passes it to ParseEndpoint, but now the code
compensating for that is just a couple of simple, well-commented lines.
Once this commit is in, we can remove that part of the fork
and remove the compensating code.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
The new code is ugly, but much faster and leaner.
name old time/op new time/op delta
SetPeers-8 7.81µs ± 1% 3.59µs ± 1% -54.04% (p=0.000 n=9+10)
name old alloc/op new alloc/op delta
SetPeers-8 7.68kB ± 0% 2.53kB ± 0% -67.08% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
SetPeers-8 237 ± 0% 99 ± 0% -58.23% (p=0.000 n=10+10)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Because it showed up on hello profiles.
Cycle through some moderate-sized sets of peers.
This should cover the "small tweaks to netmap"
and the "up/down cycle" cases.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Yes, it printed, but that was an implementation detail for hashing.
And coming optimization will make it print even less.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Not that it matters, but we were missing a close parens.
It's cheap, so add it.
name old time/op new time/op delta
Hash-8 6.64µs ± 0% 6.67µs ± 1% +0.42% (p=0.008 n=9+10)
name old alloc/op new alloc/op delta
Hash-8 1.54kB ± 0% 1.54kB ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
Hash-8 37.0 ± 0% 37.0 ± 0% ~ (all equal)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
The struct field names don't change within a single run,
so they are irrelevant. Use the field index instead.
name old time/op new time/op delta
Hash-8 6.52µs ± 0% 6.64µs ± 0% +1.91% (p=0.000 n=6+9)
name old alloc/op new alloc/op delta
Hash-8 1.67kB ± 0% 1.54kB ± 0% -7.66% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
Hash-8 53.0 ± 0% 37.0 ± 0% -30.19% (p=0.000 n=10+10)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
These show up a lot in our data structures.
name old time/op new time/op delta
Hash-8 11.5µs ± 1% 7.8µs ± 1% -32.17% (p=0.000 n=10+10)
name old alloc/op new alloc/op delta
Hash-8 1.98kB ± 0% 1.67kB ± 0% -15.73% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
Hash-8 82.0 ± 0% 53.0 ± 0% -35.37% (p=0.000 n=10+10)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
The sha256 hash writer doesn't implement WriteString.
(See https://github.com/golang/go/issues/38776.)
As a consequence, we end up converting many strings to []byte.
Wrapping a bufio.Writer around the hash writer lets us
avoid these conversions by using WriteString.
Using a bufio.Writer is, perhaps surprisingly, almost as cheap as using unsafe.
The reason is that the sha256 writer does internal buffering,
but doesn't do any when handed larger writers.
Using a bufio.Writer merely shifts the data copying from one buffer
to a different one.
Using a concrete type for Print and print cuts 10% off of the execution time.
name old time/op new time/op delta
Hash-8 15.3µs ± 0% 11.5µs ± 0% -24.84% (p=0.000 n=10+10)
name old alloc/op new alloc/op delta
Hash-8 2.82kB ± 0% 1.98kB ± 0% -29.57% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
Hash-8 140 ± 0% 82 ± 0% -41.43% (p=0.000 n=10+10)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
deepprint currently accounts for 15% of allocs in tailscaled.
This is a useful benchmark to have.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
On benchmark completion, we shut down the wgengine.
If we happen to poll for status during shutdown,
we get an "engine closing" error.
It doesn't hurt anything; ignore it.
Fixestailscale/corp#1776
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
interfaces.Tailscale only returns an interface if it has at least one Tailscale
IP assigned to it. In the resolved DNS manager, when we're called upon to tear
down DNS config, the interface no longer has IPs.
Instead, look up the interface index on construction and reuse it throughout
the daemon lifecycle.
Fixes#1892.
Signed-off-by: David Anderson <dave@natulte.net>
If nobody is connected to the IPN bus, don't burn CPU & waste
allocations (causing more GC) by encoding netmaps for nobody.
This will notably help hello.ipn.dev.
Updates tailscale/corp#1773
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Without any synchronization here, the "first packet" callback can
be delayed indefinitely, while other work continues.
Since the callback starts the benchmark timer, this could skew results.
Worse, if the benchmark manages to complete before the benchmark timer begins,
it'll cause a data race with the benchmark shutdown performed by package testing.
That is what is reported in #1881.
This is a bit unfortunate, in that it means that users of TrafficGen have
to be careful to keep this callback speedy and lightweight and to avoid deadlocks.
Fixes#1881
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
It is possible to get multiple status callbacks from an Engine.
We need to wait for at least one from each Engine.
Without limiting to one per Engine,
wait.Wait can exit early or can panic due to a negative counter.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
This reduces the speed with which these benchmarks exhaust their supply fds.
Not to zero unfortunately, but it's still helpful when doing long runs.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
This function accounted for ~1% of all allocs by tailscaled.
It is trivial to improve, so may as well.
name old time/op new time/op delta
KeyMarshalText-8 197ns ± 0% 47ns ± 0% -76.12% (p=0.016 n=4+5)
name old alloc/op new alloc/op delta
KeyMarshalText-8 200B ± 0% 80B ± 0% -60.00% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
KeyMarshalText-8 5.00 ± 0% 1.00 ± 0% -80.00% (p=0.008 n=5+5)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
The old way was way too fragile and had felt like it had more special
cases than normal cases. (see #1874, #1860, #1834, etc) It became very
obvious the old algorithm didn't work when we made the output be
pretty and try to show the user the command they need to run in
5ecc7c7200 for #1746)
The new algorithm is to map the prefs (current and new) back to flags
and then compare flags. This nicely handles the OS-specific flags and
the n:1 and 1:n flag:pref cases.
No change in the existing already-massive test suite, except some ordering
differences (the missing items are now sorted), but some new tests are
added for behavior that was broken before. In particular, it now:
* preserves non-pref boolean flags set to false, and preserves exit
node IPs (mapping them back from the ExitNodeID pref, as well as
ExitNodeIP),
* doesn't ignore --advertise-exit-node when doing an EditPrefs call
(#1880)
* doesn't lose the --operator on the non-EditPrefs paths (e.g. with
--force-reauth, or when the backend was not in state Running).
Fixes#1880
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Needed for the "up checker" to map back from exit node stable IDs (the
ipn.Prefs.ExitNodeID) back to an IP address in error messages.
But also previously requested so people can use it to then make API
calls. The upcoming "tailscale admin" subcommand will probably need it
too.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This reverts commit 7d16c8228b.
I have no idea how I ended up here. The bug I was fixing with this change
fails to reproduce on Ubuntu 18.04 now, and this change definitely does
break 20.04, 20.10, and Debian Buster. So, until we can reliably reproduce
the problem this was meant to fix, reverting.
Part of #1875
Signed-off-by: David Anderson <dave@natulte.net>
The --advertise-routes and --advertise-exit-node flags both mutating
one pref is the gift that keeps on giving.
I need to rewrite the this up warning code to first map prefs back to
flag values and then just compare flags instead of comparing prefs,
but this is the minimal fix for now.
This also includes work on the tests, to make them easier to write
(and more accurate), by letting you write the flag args directly and
have that parse into the upArgs/MaskedPrefs directly, the same as the
code, rather than them being possibly out of sync being written by
hand.
Fixes https://twitter.com/EXPbits/status/1390418145047887877
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Fields rename only.
Part of the general effort to make our code agnostic about endpoint formatting.
It's just a name, but it will soon be a misleading one; be more generic.
Do this as a separate commit because it generates a lot of whitespace changes.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Upstream wireguard-go renamed the interface method
from CreateEndpoint to ParseEndpoint.
I missed some comments. Fix them.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
By using conn.NewDefaultBind, this test requires that our endpoints
be comprehensible to wireguard-go. Instead, use a no-op bind that
treats endpoints as opaque strings.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Instead of calling ParseHex, do the hex.Decode directly.
name old time/op new time/op delta
UnmarshalJSON-8 86.9ns ± 0% 42.6ns ± 0% -50.94% (p=0.000 n=15+14)
name old alloc/op new alloc/op delta
UnmarshalJSON-8 128B ± 0% 0B -100.00% (p=0.000 n=15+15)
name old allocs/op new allocs/op delta
UnmarshalJSON-8 2.00 ± 0% 0.00 -100.00% (p=0.000 n=15+15)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Legacy endpoints (addrSet) currently reconstruct their dst string when requested.
Instead, store the dst string we were given to begin with.
In addition to being simpler and cheaper, this makes less code
aware of how to interpret endpoint strings.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Prefer the error from the actual wireguard-go device method call,
not {To,From}UAPI, as those tend to be less interesting I/O errors.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
When wireguard-go's UAPI interface fails with an error, ReconfigDevice hangs.
Fix that by buffering the channel and closing the writer after the call.
The code now matches the corresponding code in DeviceConfig, where I got it right.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
It is unused, and has been since early Feb 2021 (Tailscale 1.6).
We can't get delete the DeviceOptions entirely yet;
first #1831 and #1839 need to go in, along with some wireguard-go changes.
Deleting this chunk of code now will make the later commits more clearly correct.
Pingers can now go too.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
The earlier eb06ec172f fixed
the flaky SSH issue (tailscale/corp#1725) by making sure that packets
addressed to Tailscale IPs in hybrid netstack mode weren't delivered
to netstack, but another issue remained:
All traffic handled by netstack was also potentially being handled by
the host networking stack, as the filter hook returned "Accept", which
made it keep processing. This could lead to various random racey chaos
as a function of OS/firewalls/routes/etc.
Instead, once we inject into netstack, stop our caller's packet
processing.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Whenever we dropped a packet due to ACLs, wireguard-go was logging:
Failed to write packet to TUN device: packet dropped by filter
Instead, just lie to wireguard-go and pretend everything is okay.
Fixes#1229
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Upstream wireguard-go renamed the interface method
from CreateEndpoint to ParseEndpoint.
I updated the log call site but not the allowlist.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
#1817 removed the only place in our CI where we executed our benchmark code.
Fix that by executing it everywhere.
The benchmarks are generally cheap and fast,
so this should add minimal overhead.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
To prevent issues like #1786, run staticcheck on the primary GOOSes:
linux, mac, and windows.
Windows also has a fair amount of GOARCH-specific code.
If we ever have GOARCH staticcheck failures on other GOOSes,
we can expand the test matrix further.
This requires installing the staticcheck binary so that
we can execute it with different GOOSes.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
This is needed because the original opts.Prefs field was at some point
subverted for use in frontend->backend state migration for backward
compatibility on some platforms. We still need that feature, but we
also need the feature of providing the full set of prefs from
`tailscale up`, *not* including overwriting the prefs.Persist keys, so
we can't use the original field from `tailscale up`.
`tailscale up` had attempted to compensate for that by doing SetPrefs()
before Start(), but that violates the ipn.Backend contract, which says
you should call Start() before anything else (that's why it's called
Start()). As a result, doing SetPrefs({ControlURL=...,
WantRunning=true}) would cause a connection to the *previous* control
server (because WantRunning=true), and then connect to the *new*
control server only after running Start().
This problem may have been avoided before, but only by pure luck.
It turned out to be relatively harmless since the connection to the old
control server was immediately closed and replaced anyway, but it
created a race condition that could have caused spurious notifications
or rejected keys if the server responded quickly.
As already covered by existing TODOs, a better fix would be to have
Start() get out of the business of state migration altogether. But
we're approaching a release so I want to make the minimum possible fix.
Fixes#1840.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
We were over-eager in running tailscale in GUI mode.
f42ded7acf fixed that by
checking for a variety of shell-ish env vars and using those
to force us into CLI mode.
However, for reasons I don't understand, those shell env vars
are present when Xcode runs Tailscale.app on my machine.
(I've changed no configs, modified nothing on a brand new machine.)
Work around that by adding an additional "only in GUI mode" check.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
I was going to write a test for this using the tstest/integration test
stuff, but the testcontrol implementation isn't quite there yet (it
always registers nodes and doesn't provide AuthURLs). So, manually
tested for now.
Fixes#1843
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Per discussion, we want to have only one test assertion library,
and we want to start by exploring quicktest.
This was a mostly mechanical translation.
I think we could make this nicer by defining a few helper
closures at the beginning of the test. Later.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
This fixes#1833 in two ways:
* stop setting NoSNAT on non-Linux. It only matters on Linux and the flag
is hidden on non-Linux, but the code was still setting it. Because of
that, the new pref-reverting safety checks were failing when it was
changing.
* Ignore the two Linux-only prefs changing on non-Linux.
Fixes#1833
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There's no need to warn that it was not provided on the command line
after doing a sequence of up; logout; up --args. If you're asking for
tailscale to be up, you always mean that you prefer LoggedOut to become
false.
Fixes#1828
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
Prior to wireguard-go using printf-style logging,
all wireguard-go logging occurred using format string "%s".
We fixed that but continued to use %s when we rewrote
peer identifiers into Tailscale style.
This commit removes that %sl, which makes rate limiting work correctly.
As a happy side-benefit, it should generate less garbage.
Instead of replacing all wireguard-go peer identifiers
that might occur anywhere in a fully formatted log string,
assume that they only come from args.
Check all args for things that look like *device.Peers
and replace them with appropriately reformatted strings.
There is a variety of ways that this could go wrong
(unusual format verbs or modifiers, peer identifiers
occurring as part of a larger printed object, future API changes),
but none of them occur now, are likely to be added,
or would be hard to work around if they did.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
The "stop phrases" we use all occur in wireguard-go in the format string.
We can avoid doing a bunch of fmt.Sprintf work when they appear.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
This removes the NewLocalBackendWithClientGen constructor added in
b4d04a065f and instead adds
LocalBackend.SetControlClientGetterForTesting, mirroring
LocalBackend.SetHTTPTestClient. NewLocalBackendWithClientGen was
weird in being exported but taking an unexported type. This was noted
during code review:
https://github.com/tailscale/tailscale/pull/1818#discussion_r623155669
which ended in:
"I'll leave it for y'all to clean up if you find some way to do it elegantly."
This is more idiomatic.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Without this, macOS would fail to display its menu state correctly if you
started it while !WantRunning. It relies on the netmap in order to show
the logged-in username.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
There was logic that would make a "down" tailscale backend (ie.
!WantRunning) refuse to do any network activity. Unfortunately, this
makes the macOS and iOS UI unable to render correctly if they start
while !WantRunning.
Now that we have Prefs.LoggedOut, use that instead. So `tailscale down`
will still allow the controlclient to connect its authroutine, but
pause the maproutine. `tailscale logout` will entirely stop all
activity.
This new behaviour is not obviously correct; it's a bit annoying that
`tailsale down` doesn't terminate all activity like you might expect.
Maybe we should redesign the UI code to render differently when
disconnected, and then revert this change.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
EditPrefs should be just a wrapper around the action of changing prefs,
but someone had added a side effect of calling Login() sometimes. The
side effect happened *after* running the state machine, which would
sometimes result in us going into NeedsLogin immediately before calling
cc.Login().
This manifested as the macOS app not being able to Connect if you
launched it with LoggedOut=false and WantRunning=false. Trying to
Connect() would sent us to the NeedsLogin state instead.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
- Switch to our own simpler token bucket, since x/time/rate is missing
necessary stuff (can't provide your own time func; can't check the
current bucket contents) and it's overkill anyway.
- Add tests that actually include advancing time.
- Don't remove the rate limit on a message until there's enough room to
print at least two more of them. When we do, we'll also print how
many we dropped, as a contextual reminder that some were previously
lost. (This is more like how the Linux kernel does it.)
- Reformat the [RATE LIMITED] messages to be shorter, and to not
corrupt original message. Instead, we print the message, then print
its format string.
- Use %q instead of \"%s\", for more accurate parsing later, if the
format string contained quotes.
Fixes#1772
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
A very long unit test that verifies the way the controlclient and
ipn.Backend interact.
This is a giant sequential test of the state machine. The test passes,
but only because it's asserting all the wrong behaviour. I marked all
the behaviour I think is wrong with BUG comments, and several
additional test opportunities with TODO.
Note: the new test supercedes TestStartsInNeedsLoginState, which was
checking for incorrect behaviour (although the new test still checks
for the same incorrect behaviour) and assumed .Start() would converge
before returning, which it happens to do, but only for this very
specific case, for the current implementation. You're supposed to wait
for the notifications.
Updates: tailscale/corp#1660
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
With this change, shared node names resolve correctly on split DNS-supporting
operating systems.
Fixestailscale/corp#1706
Signed-off-by: David Anderson <danderson@tailscale.com>
Only minimal tailscale + tailscaled for now.
And a super minimal in-memory logcatcher.
No control ... yet.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Pointer receivers used with MarshalJSON are code rakes.
https://github.com/golang/go/issues/22967https://github.com/dominikh/go-tools/issues/911
I just stepped on one, and it hurt. Turn it over.
While we're here, optimize the code a bit.
name old time/op new time/op delta
MarshalJSON-8 184ns ± 0% 44ns ± 0% -76.03% (p=0.000 n=20+19)
name old alloc/op new alloc/op delta
MarshalJSON-8 184B ± 0% 80B ± 0% -56.52% (p=0.000 n=20+20)
name old allocs/op new allocs/op delta
MarshalJSON-8 4.00 ± 0% 1.00 ± 0% -75.00% (p=0.000 n=20+20)
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
For historical reasons, we ended up with two near-duplicate
copies of curve25519 key types, one in the wireguard-go module
(wgcfg) and one in the tailscale module (types/wgkey).
Then we moved wgcfg to the tailscale module.
We can now remove the wgcfg key type in favor of wgkey.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
One of the consequences of the bind refactoring in 6f23087175
is that attempting to bind an IPv6 socket will always
result in c.pconn6.pconn being non-nil.
If the bind fails, it'll be set to a placeholder packet conn
that blocks forever.
As a result, we can always run ReceiveIPv6 and health check it.
This removes IPv4/IPv6 asymmetry and also will allow health checks
to detect any IPv6 receive func failures.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
It must be an IP address; enforce that at the type level.
Suggested-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
We had two separate code paths for the initial UDP listener bind
and any subsequent rebinds.
IPv6 got left out of the rebind code.
Rather than duplicate it there, unify the two code paths.
Then improve the resulting code:
* Rebind had nested listen attempts to try the user-specified port first,
and then fall back to :0 if that failed. Convert that into a loop.
* Initial bind tried only the user-specified port.
Rebind tried the user-specified port and 0.
But there are actually three ports of interest:
The one the user specified, the most recent port in use, and 0.
We now try all three in order, as appropriate.
* In the extremely rare case in which binding to port 0 fails,
use a dummy net.PacketConn whose reads block until close.
This will keep the wireguard-go receive func goroutine alive.
As a pleasant side-effect of this, if we decide that
we need to resuscitate #1796, it will now be much easier.
Fixes#1799
Co-authored-by: David Anderson <danderson@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
Assume it'll stay at 0 forever, so hard-code it
and delete code conditional on it being non-0.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
It was set to context.Background by all callers, for the same reasons.
Set it locally instead, to simplify call sites.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
For when we need to tweak behavior or errors as a function of which of
3 macOS Tailscale variants we're using. (more accessors coming later
as needed)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The old implementation knew too much about how wireguard-go worked.
As a result, it missed genuine problems that occurred due to unrelated bugs.
This fourth attempt to fix the health checks takes a black box approach.
A receive func is healthy if one (or both) of these conditions holds:
* It is currently running and blocked.
* It has been executed recently.
The second condition is required because receive functions
are not continuously executing. wireguard-go calls them and then
processes their results before calling them again.
There is a theoretical false positive if wireguard-go go takes
longer than one minute to process the results of a receive func execution.
If that happens, we have other problems.
Updates #1790
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
They were not doing their job.
They need yet another conceptual re-think.
Start by clearing the decks.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
We had a long-standing bug in which our TUN events channel
was being received from simultaneously in two places.
The first is wireguard-go.
At wgengine/userspace.go:366, we pass e.tundev to wireguard-go,
which starts a goroutine (RoutineTUNEventReader)
that receives from that channel and uses events to adjust the MTU
and bring the device up/down.
At wgengine/userspace.go:374, we launch a goroutine that
receives from e.tundev, logs MTU changes, and triggers
state updates when up/down changes occur.
Events were getting delivered haphazardly between the two of them.
We don't really want wireguard-go to receive the up/down events;
we control the state of the device explicitly by calling device.Up.
And the userspace.go loop MTU logging duplicates logging that
wireguard-go does when it received MTU updates.
So this change splits the single TUN events channel into up/down
and other (aka MTU), and sends them to the parties that ought
to receive them.
I'm actually a bit surprised that this hasn't caused more visible trouble.
If a down event went to wireguard-go but the subsequent up event
went to userspace.go, we could end up with the wireguard-go device disappearing.
I believe that this may also (somewhat accidentally) be a fix for #1790.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
The intention was always that files only get written to *.partial
files and renamed at the end once fully received, but somewhere in the
process that got lost in buffered mode and *.partial files were only
being used in direct receive mode. This fix prevents WaitingFiles
from returning files that are still being transferred.
Updates tailscale/corp#1626
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If DeleteFile fails on Windows due to another process (anti-virus,
probably) having our file open, instead leave a marker file that the
file is logically deleted, and remove it from API calls and clean it
up lazily later.
Updates tailscale/corp#1626
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The old decay-based one took a while to converge. This new one (based
very loosely on TCP BBR) seems to converge quickly on what seems to be
the best speed.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
This tries to generate traffic at a rate that will saturate the
receiver, without overdoing it, even in the event of packet loss. It's
unrealistically more aggressive than TCP (which will back off quickly
in case of packet loss) but less silly than a blind test that just
generates packets as fast as it can (which can cause all the CPU to be
absorbed by the transmitter, giving an incorrect impression of how much
capacity the total system has).
Initial indications are that a syscall about every 10 packets (TCP bulk
delivery) is roughly the same speed as sending every packet through a
channel. A syscall per packet is about 5x-10x slower than that.
The whole tailscale wireguard-go + magicsock + packet filter
combination is about 4x slower again, which is better than I thought
we'd do, but probably has room for improvement.
Note that in "full" tailscale, there is also a tundev read/write for
every packet, effectively doubling the syscall overhead per packet.
Given these numbers, it seems like read/write syscalls are only 25-40%
of the total CPU time used in tailscale proper, so we do have
significant non-syscall optimization work to do too.
Sample output:
$ GOMAXPROCS=2 go test -bench . -benchtime 5s ./cmd/tailbench
goos: linux
goarch: amd64
pkg: tailscale.com/cmd/tailbench
cpu: Intel(R) Core(TM) i7-4785T CPU @ 2.20GHz
BenchmarkTrivialNoAlloc/32-2 56340248 93.85 ns/op 340.98 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivialNoAlloc/124-2 57527490 99.27 ns/op 1249.10 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivialNoAlloc/1024-2 52537773 111.3 ns/op 9200.39 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivial/32-2 41878063 135.6 ns/op 236.04 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivial/124-2 41270439 138.4 ns/op 896.02 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivial/1024-2 36337252 154.3 ns/op 6635.30 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkBlockingChannel/32-2 12171654 494.3 ns/op 64.74 MB/s 0 %lost 1791 B/op 0 allocs/op
BenchmarkBlockingChannel/124-2 12149956 507.8 ns/op 244.17 MB/s 0 %lost 1792 B/op 1 allocs/op
BenchmarkBlockingChannel/1024-2 11034754 528.8 ns/op 1936.42 MB/s 0 %lost 1792 B/op 1 allocs/op
BenchmarkNonlockingChannel/32-2 8960622 2195 ns/op 14.58 MB/s 8.825 %lost 1792 B/op 1 allocs/op
BenchmarkNonlockingChannel/124-2 3014614 2224 ns/op 55.75 MB/s 11.18 %lost 1792 B/op 1 allocs/op
BenchmarkNonlockingChannel/1024-2 3234915 1688 ns/op 606.53 MB/s 3.765 %lost 1792 B/op 1 allocs/op
BenchmarkDoubleChannel/32-2 8457559 764.1 ns/op 41.88 MB/s 5.945 %lost 1792 B/op 1 allocs/op
BenchmarkDoubleChannel/124-2 5497726 1030 ns/op 120.38 MB/s 12.14 %lost 1792 B/op 1 allocs/op
BenchmarkDoubleChannel/1024-2 7985656 1360 ns/op 752.86 MB/s 13.57 %lost 1792 B/op 1 allocs/op
BenchmarkUDP/32-2 1652134 3695 ns/op 8.66 MB/s 0 %lost 176 B/op 3 allocs/op
BenchmarkUDP/124-2 1621024 3765 ns/op 32.94 MB/s 0 %lost 176 B/op 3 allocs/op
BenchmarkUDP/1024-2 1553750 3825 ns/op 267.72 MB/s 0 %lost 176 B/op 3 allocs/op
BenchmarkTCP/32-2 11056336 503.2 ns/op 63.60 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTCP/124-2 11074869 533.7 ns/op 232.32 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTCP/1024-2 8934968 671.4 ns/op 1525.20 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkWireGuardTest/32-2 1403702 4547 ns/op 7.04 MB/s 14.37 %lost 467 B/op 3 allocs/op
BenchmarkWireGuardTest/124-2 780645 7927 ns/op 15.64 MB/s 1.537 %lost 420 B/op 3 allocs/op
BenchmarkWireGuardTest/1024-2 512671 11791 ns/op 86.85 MB/s 0.5206 %lost 411 B/op 3 allocs/op
PASS
ok tailscale.com/wgengine/bench 195.724s
Updates #414.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
NetworkManager fixed the bug that forced us to use NetworkManager
if it's programming systemd-resolved, and in the same release also
made NetworkManager ignore DNS settings provided for unmanaged
interfaces... Which breaks what we used to do. So, with versions
1.26.6 and above, we MUST NOT use NetworkManager to indirectly
program systemd-resolved, but thankfully we can talk to resolved
directly and get the right outcome.
Fixes#1788
Signed-off-by: David Anderson <danderson@tailscale.com>
The existing implementation was completely, embarrassingly conceptually broken.
We aren't able to see whether wireguard-go's receive function goroutines
are running or not. All we can do is model that based on what we have done.
This commit fixes that model.
Fixes#1781
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
Avery reported a sub-ms health transition from "receiveIPv4 not running" to "ok".
To avoid these transient false-positives, be more precise about
the expected lifetime of receive funcs. The problematic case is one in which
they were started but exited prior to a call to connBind.Close.
Explicitly represent started vs running state, taking care with the order of updates.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
The connection failure diagnostic code was never updated enough for
exit nodes, so disable its misleading output when the node it picks
(incorrectly) to diagnose is only an exit node.
Fixes#1754
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The new "tailscale up" checks previously didn't protect against
--advertise-exit-node being omitted in the case that
--advertise-routes was also provided. It wasn't done before because
there is no corresponding pref for "--advertise-exit-node"; it's a
helper flag that augments --advertise-routes. But that's an
implementation detail and we can still help users. We just have to
special case that pref and look whether the current routes include
both the v4 and v6 /0 routes.
Fixes#1767
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This doesn't make --operator implicit (which we might do in the
future), but it at least doesn't require repeating it in the future
when it already matches $USER.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It was getting cleared on notify.
Document that authURL is cleared on notify and add a new field that
isn't, using the new field for the JSON status.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I've spent two days searching for a theoretical wireguard-go bug
around receive functions exiting early.
I've found many bugs, but none of the flavor we're looking for.
Restore wireguard-go's logging around starting and stopping receive functions,
so that we can definitively rule in or out this particular theory.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
I see a bunch of these in some logs I'm looking at,
separated only by a few seconds.
Log the error so we can tell what's going on here.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
These were getting rate-limited for nodes with many peers.
Consolate the output into single lines, which are nicer anyway.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
With this change, the ipnserver's safesocket.Listen (the localhost
tcp.Listen) happens right away, before any synchronous
TUN/DNS/Engine/etc setup work, which might be slow, especially on
early boot on Windows.
Because the safesocket.Listen starts up early, that means localhost
TCP dials (the safesocket.Connect from the GUI) complete successfully
and thus the GUI avoids the MessageBox error. (I verified that
pacifies it, even without a Listener.Accept; I'd feared that Windows
localhost was maybe special and avoided the normal listener backlog).
Once the GUI can then connect immediately without errors, the various
timeouts then matter less, because the backend is no longer trying to
race against the GUI's timeout. So keep retrying on errors for a
minute, or 10 minutes if the system just booted in the past 10
minutes.
This should fix the problem with Windows 10 desktops auto-logging in
and starting the Tailscale frontend which was then showing a
MessageBox error about failing to connect to tailscaled, which was
slow coming up because the Windows networking stack wasn't up
yet. Fingers crossed.
Fixes#1313 (previously #1187, etc)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This change implements Windows version of install-system-daemon and
uninstall-system-daemon subcommands. When running the commands the
user will install or remove Tailscale Windows service.
Updates #1232
Signed-off-by: Alex Brainman <alex.brainman@gmail.com>
This used to not be necessary, because MagicDNS always did full proxying.
But with split DNS, we need to know which names to route to our resolver,
otherwise reverse lookups break.
This captures the entire CGNAT range, as well as our Tailscale ULA.
Signed-off-by: David Anderson <danderson@tailscale.com>
Otherwise, the existence of authoritative domains forces full
DNS proxying even when no other DNS config is present.
Signed-off-by: David Anderson <danderson@tailscale.com>
Logout used to be a no-op, so the ipnserver previously synthensized a Logout
on disconnect. Now that Logout actually invalidates the node key that was
forcing all GUI closes to log people out.
Instead, add a method to LocalBackend to specifically mean "the
Windows GUI closed, please forget all the state".
Fixestailscale/corp#1591 (ignoring the notification issues, tracked elsewhere)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Let caller (macOS) do it so Finder progress bar can be dismissed
without races.
Updates tailscale/corp#1575
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We were accidentally logging oldPort -> oldPort.
Log oldPort as well as c.port; if we failed to get the preferred port
in a previous rebind, oldPort might differ from c.port.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
On macOS, we link the CLI into the GUI executable so it can be included in
the Mac App Store build.
You then need to run it like:
/Applications/Tailscale.app/Contents/MacOS/Tailscale <command>
But our old detection of whether you're running that Tailscale binary
in CLI mode wasn't accurate and often bit people. For instance, when
they made a typo, it then launched in GUI mode and broke their
existing GUI connection (starting a new IPNExtension) and took down
their network.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It used to just store received files URL-escaped on disk, but that was
a half done lazy implementation, and pushed the burden to callers to
validate and write things to disk in an unescaped way.
Instead, do all the validation in the receive handler and only
accept filenames that are UTF-8 and in the intersection of valid
names that all platforms support.
Fixestailscale/corp#1594
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So the NetworkMap-from-incremental-MapResponses can be tested easily.
And because direct.go was getting too big.
No change in behavior at this point. Just movement.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The ipn.NewPrefs func returns a populated ipn.Prefs for historical
reasons. It's not used or as important as it once was, but it hasn't
yet been removed. Meanwhile, it contains some default values that are
used on some platforms. Notably, for this bug (#1725), Windows/Mac use
its Prefs.RouteAll true value (to accept subnets), but Linux users
have always gotten a "false" value for that, because that's what
cmd/tailscale's CLI default flag is _for all operating systems_. That
meant that "tailscale up" was rightfully reporting that the user was
changing an implicit setting: RouteAll was changing from true with
false with the user explicitly saying so.
An obvious fix might be to change ipn.NewPrefs to return
Prefs.RouteAll == false on some platforms, but the logic is
complicated by darwin: we want RouteAll true on windows, android, ios,
and the GUI mac app, but not the CLI tailscaled-on-macOS mode. But
even if we used build tags (e.g. the "redo" build tag) to determine
what the default is, that then means we have duplicated and differing
"defaults" between both the CLI up flags and ipn.NewPrefs. Furthering
that complication didn't seem like a good idea.
So, changing the NewPrefs defaults is too invasive at this stage of
the release, as is removing the NewPrefs func entirely.
Instead, tweak slightly the semantics of the ipn.Prefs.ControlURL
field. This now defines that a ControlURL of the empty string means
both "we're uninitialized" and also "just use the default".
Then, once we have the "empty-string-means-unintialized" semantics,
use that to suppress "tailscale up"'s recent implicit-setting-revert
checking safety net, if we've never initialized Tailscale yet.
And update/add tests.
Fixes#1725
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Will add more tests later but this locks in all the existing warnings
and errors at least, and some of the existing non-error behavior.
Mostly I want this to exist before I actually fix#1725.
Updates #1725
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And fix PeerSeenChange bug where it was ignored unless there were
other peer changes.
Updates tailscale/corp#1574
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Track endpoints internally with a new tailcfg.Endpoint type that
includes a typed netaddr.IPPort (instead of just a string) and
includes a type for how that endpoint was discovered (STUN, local,
etc).
Use []tailcfg.Endpoint instead of []string internally.
At the last second, send it to the control server as the existing
[]string for endpoints, but also include a new parallel
MapRequest.EndpointType []tailcfg.EndpointType, so the control server
can start filtering out less-important endpoint changes from
new-enough clients. Notably, STUN-discovered endpoints can be filtered
out from 1.6+ clients, as they can discover them amongst each other
via CallMeMaybe disco exchanges started over DERP. And STUN endpoints
change a lot, causing a lot of MapResposne updates. But portmapped
endpoints are worth keeping for now, as they they work right away
without requiring the firewall traversal extra RTT dance.
End result will be less control->client bandwidth. (despite negligible
increase in client->control bandwidth)
Updates tailscale/corp#1543
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
They were scattered/duplicated in misc places before.
It can't be in the client package itself for circular dep reasons.
This new package is basically tailcfg but for localhost
communications, instead of to control.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This changes the behavior of "tailscale up".
Previously "tailscale up" always did a new Start and reset all the settings.
Now "tailscale up" with no flags just brings the world [back] up.
(The opposite of "tailscale down").
But with flags, "tailscale up" now only is allowed to change
preferences if they're explicitly named in the flags. Otherwise it's
an error. Or you need to use --reset to explicitly nuke everything.
RELNOTE=tailscale up change
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Some paths already didn't. And in the future I hope to shut all the
notify funcs down end-to-end when nothing is connected (as in the
common case in tailscaled). Then we can save some JSON encoding work.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We've been slowly making Start less special and making IPN a
multi-connection "watch" bus of changes, but this Start specialness
had remained.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Clear LLMNR and mdns flags, update reasoning for our settings,
and set our override priority harder than before when we want
to be primary resolver.
Signed-off-by: David Anderson <danderson@tailscale.com>
Debian resolvconf is not legacy, it's alive and well,
just historically before the other implementations.
Signed-off-by: David Anderson <danderson@tailscale.com>
On FreeBSD, we add the interface IP as a /48 to work around a kernel
bug, so we mustn't then try to add a /48 route to the Tailscale ULA,
since that will fail as a dupe.
Signed-off-by: David Anderson <danderson@tailscale.com>
It was only Linux and BSDs before, but now with netstack mode, it also works on
Windows and darwin. It's not worth limiting it to certain platforms.
Tailscaled itself can complain/fail if it doesn't like the settings
for the mode/OS it's operating under.
Updates #707
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This allows split-DNS configurations to not break clients on OSes that
haven't yet been ported to understand split DNS, by falling back to quad-9
as a global resolver when handed an "impossible to implement"
split-DNS config.
Part of #953. Needs to be removed before shipping 1.8.
Signed-off-by: David Anderson <danderson@tailscale.com>
With this change, all OSes can sort-of do split DNS, except that the
default upstream is hardcoded to 8.8.8.8 pending further plumbing.
Additionally, Windows 8-10 can do split DNS fully correctly, without
the 8.8.8.8 hack.
Part of #953.
Signed-off-by: David Anderson <danderson@tailscale.com>
When searching for the matching client identity, the returned
certificate chain was accidentally set to that of the last identity
returned by the certificate store instead of the one corresponding to
the selected identity.
Also, add some extra error checking for invalid certificate chains, just
in case.
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
We already had SetNotifyCallback elsewhere on controlclient, so use
that name.
Baby steps towards some CLI refactor work.
Updates tailscale/tailscale#1436
It seems that all the setups that support split DNS understand
this distinction, and it's an important one when translating
high-level configuration.
Part of #953.
Signed-off-by: David Anderson <danderson@tailscale.com>
Correctly reports that Win7 cannot do split DNS, and has a helper to
discover the "base" resolvers for the system.
Part of #953
Signed-off-by: David Anderson <danderson@tailscale.com>
OS implementations are going to support split DNS soon.
Until they're all in place, hardcode Primary=true to get
the old behavior.
Signed-off-by: David Anderson <danderson@tailscale.com>
This is usually the same as the requested interface, but on some
unixes can vary based on device number allocation, and on Windows
it's the GUID instead of the pretty name, since everything relating
to configuration wants the GUID.
Signed-off-by: David Anderson <danderson@tailscale.com>
wgengine/router.CallbackRouter needs to support both the Router
and OSConfigurator interfaces, so the setters can't both be called
Set.
Signed-off-by: David Anderson <danderson@tailscale.com>
It existed to work around the frequent opening and closing
of the conn.Bind done by wireguard-go.
The preceding commit removed that behavior,
so we can simply close the connections
when we are done with them.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
We don't use the port that wireguard-go passes to us (via magicsock.connBind.Open).
We ignore it entirely and use the port we selected.
When we tell wireguard-go that we're changing the listen_port,
it calls connBind.Close and then connBind.Open.
And in the meantime, it stops calling the receive functions,
which means that we stop receiving and processing UDP and DERP packets.
And that is Very Bad.
That was never a problem prior to b3ceca1dd7,
because we passed the SkipBindUpdate flag to our wireguard-go fork,
which told wireguard-go not to re-bind on listen_port changes.
That commit eliminated the SkipBindUpdate flag.
We could write a bunch of code to work around the gap.
We could add background readers that process UDP and DERP packets when wireguard-go isn't.
But it's simpler to never create the conditions in which wireguard-go rebinds.
The other scenario in which wireguard-go re-binds is device.Down.
Conveniently, we never call device.Down. We go from device.Up to device.Close,
and the latter only when we're shutting down a magicsock.Conn completely.
Rubber-ducked-by: Avery Pennarun <apenwarr@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
The shim implements both network and DNS configurators,
and feeds both into a single callback that receives
both configs.
Signed-off-by: David Anderson <danderson@tailscale.com>
Upstream wireguard-go has changed its receive model.
NewDevice now accepts a conn.Bind interface.
The conn.Bind is stateless; magicsock.Conns are stateful.
To work around this, we add a connBind type that supports
cheap teardown and bring-up, backed by a Conn.
The new conn.Bind allows us to specify a set of receive functions,
rather than having to shoehorn everything into ReceiveIPv4 and ReceiveIPv6.
This lets us plumbing DERP messages directly into wireguard-go,
instead of having to mux them via ReceiveIPv4.
One consequence of the new conn.Bind layer is that
closing the wireguard-go device is now indistinguishable
from the routine bring-up and tear-down normally experienced
by a conn.Bind. We thus have to explicitly close the magicsock.Conn
when the close the wireguard-go device.
One downside of this change is that we are reliant on wireguard-go
to call receiveDERP to process DERP messages. This is fine for now,
but is perhaps something we should fix in the future.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
The common Linux start-up path (fallback file defined but not
existing) was missing the log print of initializing Prefs. The code
was too twisty. Simplify a bit.
Updates #1573
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
They need some rework to do the right thing, in the meantime the direct
and resolvconf managers will work out.
The resolved implementation was never selected due to control-side settings.
The networkmanager implementation mostly doesn't get selected due to
unforeseen interactions with `resolvconf` on many platforms.
Both implementations also need rework to support the various routing modes
they're capable of.
Signed-off-by: David Anderson <danderson@tailscale.com>
It's only use to skip some optional initialization during cleanup,
but that work is very minor anyway, and about to change drastically.
Signed-off-by: David Anderson <danderson@tailscale.com>
It's currently unused, and no longer makes sense with the upcoming
DNS infrastructure. Keep it in tailcfg for now, since we need protocol
compat for a bit longer.
Signed-off-by: David Anderson <danderson@tailscale.com>
Old macOS clients required we populate this field to a non-null
value so we were unable to remove this field before.
Instead, keep the field but change its type to a custom empty struct
that can marshal/unmarshal JSON. And lock it in with a test.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The bool was already called useNetstack at the caller.
isUserspace (to mean netstack) is confusing next to wgengine.NewUserspaceEngine, as that's
a different type of 'userspace'.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The code is not obviously better or worse, but this makes the little warning
triangle in my editor go away, and the distraction removal is worth it.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Google Cloud Run does not implement NETLINK_ROUTE RTMGRP.
If initialization of the netlink socket or group membership
fails, fall back to a polling implementation.
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
The resolver still only supports a single upstream config, and
ipn/wgengine still have to split up the DNS config, but this moves
closer to unifying the DNS configs.
As a handy side-effect of the refactor, IPv6 MagicDNS records exist
now.
Signed-off-by: David Anderson <danderson@tailscale.com>
They're only used internally and in tests, and have surprising
semantics in that they only resolve MagicDNS names, not upstream
resolver queries.
Signed-off-by: David Anderson <danderson@tailscale.com>
This adds a new ipn.MaskedPrefs embedding a ipn.Prefs, along with a
bunch of "has bits", kept in sync with tests & reflect.
Then it adds a Prefs.ApplyEdits(MaskedPrefs) method.
Then the ipn.Backend interface loses its weirdo SetWantRunning(bool)
method (that I added in 483141094c for "tailscale down")
and replaces it with EditPrefs (alongside the existing SetPrefs for now).
Then updates 'tailscale down' to use EditPrefs instead of SetWantRunning.
In the future, we can use this to do more interesting things with the
CLI, reconfiguring only certain properties without the reset-the-world
"tailscale up".
Updates #1436
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We were going to remove this in Tailscale 1.3 but forgot.
This means Tailscale 1.8 users won't be able to downgrade to Tailscale
1.0, but that's fine.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Adding a subcommand which prints and logs a log marker. This should help
diagnose any issues that users face.
Fixes#1466
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Instead of having the CLI check whether IP forwarding is enabled, ask
tailscaled. It has a better idea. If it's netstack, for instance, the
sysctl values don't matter. And it's possible that only the daemon has
permission to know.
Fixes#1626
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The call to appendEndpoint updates cpeer.Endpoints.
Then it is overwritten in the next line.
The only errors from appendEndpoint occur when
the host/port pair is malformed, but that cannot happen.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
IPv6 Unique Local Addresses are sometimes used with Network
Prefix Translation to reach the Internet. In that respect
their use is similar to the private IPv4 address ranges
10/8, 172.16/12, and 192.168/16.
Treat them as sufficient for AnyInterfaceUp(), but specifically
exclude Tailscale's own IPv6 ULA prefix to avoid mistakenly
trying to bootstrap Tailscale using Tailscale.
This helps in supporting Google Cloud Run, where the addresses
are 169.254.8.1/32 and fddf:3978:feb1:d745::c001/128 on eth1.
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
It can end up executing an a new goroutine,
at which point instead of immediately stopping test execution, it hangs.
Since this is unexpected anyway, panic instead.
As a bonus, it makes call sites nicer and removes a kludge comment.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Without this, `tailscale status` ignores the --socket flag on macOS and
always talks to the IPNExtension, even if you wanted it to inspect a
userspace tailscaled.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
So we have a documented & tested way to check whether we're in
netstack mode. To be used by future commits.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
For discovery when an explicit hostname/IP is known. We'll still
also send it via control for finding peers by a list.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The concrete type being encoded changed from a value to pointer
earlier and this was never adjusted.
(People don't frequently use TS_DEBUG_MAP to see requests, so it went
unnoticed until now.)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
"Fake" doesn't mean a lot any more, given that many components
of the engine can be faked out, including in valid production
configurations like userspace-networking.
Signed-off-by: David Anderson <danderson@tailscale.com>
This makes setup more explicit in prod codepaths, without
requiring a bunch of arguments or helpers for tests and
userspace mode.
Signed-off-by: David Anderson <danderson@tailscale.com>
The Windows CI machine experiences significant random execution delays.
For example, in this code from watchdog.go:
done := make(chan bool)
go func() {
start := time.Now()
mu.Lock()
There was a 500ms delay from initializing done to locking mu.
This test checks that we receive a sufficient number of events quickly enough.
In the face of random 500ms delays, unsurprisingly, the test fails.
There's not much principled we can do about it.
We could build a system of retries or attempt to detect these random delays,
but that game isn't worth the candle.
Skip the test.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
This works around the close syscall being slow.
We can revert this if we find a fix or if Apple makes close fast again.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
The tstun packagen contains both constructors for generic tun
Devices, and a wrapper that provides additional functionality.
Signed-off-by: David Anderson <danderson@tailscale.com>
IPv4 and IPv6 both work remotely, but IPv6 doesn't yet work from the
machine itself due to routing mysteries.
Untested yet on iOS, but previous prototype worked on iOS, so should
work the same.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Now callers (wgengine/monitor) don't need to mutate the state to remove
boring interfaces before calling State.Equal. Instead, the methods
to remove boring interfaces from the State are removed, as is
the reflect-using Equal method itself, and in their place is
a new EqualFiltered method that takes a func predicate to match
interfaces to compare.
And then the FilterInteresting predicate is added for use
with EqualFiltered to do the job that that wgengine/monitor
previously wanted.
Now wgengine/monitor can keep the full interface state around,
including the "boring" interfaces, which we'll need for peerapi on
macOS/iOS to bind to the interface index of the utunN device.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We have it already but threw it away. But macOS/iOS code will
be needing the interface index, so hang on to it.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
control/controlclient: sign RegisterRequest
Some customers wish to verify eligibility for devices to join their
tailnets using machine identity certificates. TLS client certs could
potentially fulfill this role but the initial customer for this feature
has technical requirements that prevent their use. Instead, the
certificate is loaded from the Windows local machine certificate store
and uses its RSA public key to sign the RegisterRequest message.
There is room to improve the flexibility of this feature in future and
it is currently only tested on Windows (although Darwin theoretically
works too), but this offers a reasonable starting place for now.
Updates tailscale/coral#6
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
So we can empty import the guts of cmd/tailscaled from another
module for go mod tidy reasons.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This adds an easy and portable way for us to document how to get
your Tailscale IP address.
$ tailscale ip
100.74.70.3
fd7a:115c:a1e0:ab12:4843:cd96:624a:4603
$ tailscale ip -4
100.74.70.3
$ tailscale ip -6
fd7a:115c:a1e0:ab12:4843:cd96:624a:4603
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
e.g.
$ tailscale ping 1.1.1.1
exit node found but not enabled
$ tailscale ping 10.2.200.2
node "tsbfvlan2" found, but not using its 10.2.200.0/24 route
$ sudo tailscale up --accept-routes
$ tailscale ping 10.2.200.2
pong from tsbfvlan2 (100.124.196.94) via 10.2.200.34:41641 in 1ms
$ tailscale ping mon.ts.tailscale.com
pong from monitoring (100.88.178.64) via DERP(sfo) in 83ms
pong from monitoring (100.88.178.64) via DERP(sfo) in 21ms
pong from monitoring (100.88.178.64) via [2604:a880:4:d1::37:d001]:41641 in 22ms
This necessarily moves code up from magicsock to wgengine, so we can
look at the actual wireguard config.
Fixes#1564
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add proto to flowtrack.Tuple.
Add types/ipproto leaf package to break a cycle.
Server-side ACL work remains.
Updates #1516
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Mash up some code from ffcli and std's flag package to make a default
usage func that's super explicit for those not familiar with the Go
style flags. Only show double hyphens in usage text (but still accept both),
and show default values, and only show the proper usage of boolean flags.
Fixes#1353Fixes#1529
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
"public IP" is defined as an IP address configured on the exit node
itself that isn't in the list of forbidden ranges (RFC1918, CGNAT,
Tailscale).
Fixes#1522.
Signed-off-by: David Anderson <danderson@tailscale.com>
The direct client already logs it in JSON form. Then it's immediately
logged again in an unformatted dump, so this removes that unformatted
one.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This reverts the revert commit 84aba349d9.
And changes us to use inet.af/netstack.
Updates #1518
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In f45a9e291b (2021-03-04), I tried to bump CurrentMapRequestVersion
to 12 but only documented the meaning of 12 but forgot to actually
increase it from 11.
Mapver 11 was added in ea49b1e811 (2021-03-03).
Fix this in its own commit so we can cherry-pick it to the 1.6 release
branch.
Should help iOS battery life on NEProvider.wake/skip events
with useless route updates that shouldn't cause re-STUNs.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We strip them control-side anyway, and we already strip IPv4 link
local, so there's no point uploading them. And iOS has a ton of them,
which results in somewhat silly amount of traffic in the MapRequest.
We'll be doing same-LAN-inter-tailscaled link-local traffic a
different way, with same-LAN discovery.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
To atone for 1d7f9d5b4a, the revert of 4224b3f731.
At least it's fast again, even if it's shelling out to cmd.exe (once now).
Updates #1478
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
gVisor fixed their google/gvisor#1446 so we can include gVisor mode
on 32-bit machines.
A few minor upstream API changes, as normal.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We basically already had the RIB-parsing Go code for this in both
net/interfaces and wgengine/monitor, for other reasons.
Fixes#1426Fixes#1471
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This change makes it impossible to set your own IP address as the exit node for this system.
Fixes#1489
Signed-off-by: Christine Dodrill <xe@tailscale.com>
So a region can be used if needed, but won't be STUN-probed or used as
its home.
This gives us another possible debugging mechanism for #1310, or can
be used as a short-term measure against DERP flip-flops for people
equidistant between regions if our hysteresis still isn't good enough.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
IP forwarding is not required when advertising a machine's local IPs
over Tailscale.
Fixes#1435.
Signed-off-by: David Anderson <danderson@tailscale.com>
This reverts commit 08949d4ef1.
I think this code was aspirational. There's no code that sets up the
appropriate NAT code using pfctl/etc. See #911 and #1475.
Updates #1475
Updates #911
No server support yet, but we want Tailscale 1.6 clients to be able to respond
to them when the server can do it.
Updates #1310
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There was a logical race where Conn.Rebind could acquire the
RebindingUDPConn mutex, close the connection, fail to rebind, release
the mutex, and then because the mutex was no longer held, ReceiveIPv4
wouldn't retry reads that failed with net.ErrClosed, letting that
error back to wireguard-go, which would then stop running that receive
IP goroutine.
Instead, keep the RebindingUDPConn mutex held for the entirety of the
replacement in all cases.
Updates tailscale/corp#1289
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
interfaces.State.String tries to print a concise summary of the
network state, removing any interfaces that don't have any or any
interesting IP addresses. On macOS and iOS, for instance, there are a
ton of misc things.
But the link monitor based its are-there-changes decision on
interfaces.State.Equal, which just used reflect.DeepEqual, including
comparing all the boring interfaces. On macOS, when turning wifi on or off, there
are a ton of misc boring interface changes, resulting in hitting an earlier
check I'd added on suspicion this was happening:
[unexpected] network state changed, but stringification didn't
This fixes that by instead adding a new
interfaces.State.RemoveUninterestingInterfacesAndAddresses method that
does, uh, that. Then use that in the monitor. So then when Equal is
used later, it's DeepEqualing the already-cleaned version with only
interesting interfaces.
This makes cmd/tailscaled debug --monitor much less noisy.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Windows was only running the localapi on the debug port which was a
stopgap at the time while doing peercreds work. Removed that, and
wired it up correctly, with some more docs.
More clean-up to do after 1.6, moving the localhost TCP auth code into
the peercreds package. But that's too much for now, so the docs will
have to suffice, even if it's at a bit of an awkward stage with the
newly-renamed "NotWindows" field, which still isn't named well, but
it's better than its old name of "Unknown" which hasn't been accurate
since unix sock peercreds work anyway.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So the control server can test whether a client's actually present.
Most clients are over HTTP/2, so these pings (to the same host) are
super cheap.
This mimics the earlier goroutine dump mechanism.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously the CLI could only find the HTTP auth token when running
the CLI outside the sandbox, not like
/Applications/Tailscale.app/Contents/MacOS/Tailscale when that was
from the App Store.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The debub subcommand was moved in
6254efb9ef because the monitor brought
in tons of dependencies to the cmd/tailscale binary, but there wasn't
any need to remove the whole subcommand itself.
Add it back, with a tool to dump the local daemon's goroutines.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Not beautiful, but I'm debugging connectivity problems on
NEProvider.sleep+wake and need more clues.
Updates #1426
Updates tailscale/corp#1289
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is important because some of those v6 sockets are actually
dual-stacked sockets, so this is our only chance of discovering
some services.
Fixes#1443.
Signed-off-by: David Anderson <danderson@tailscale.com>
And if we have over 10,000 CGNAT routes, just route the entire
CGNAT range. (for the hello test server)
Fixes#1450
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The Engine.LinkChange method was recently removed in
e3df29d488 while misremembering how
Android's link state mechanism worked.
Rather than do some last minute rearchitecting of link state on
Android before Tailscale 1.6, restore the old Engine.LinkChange hook
for now so the Android client doesn't need any changes. But change how
it's implemented to instead inject an event into the link monitor.
Fixes#1427
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is necessary because either protocol can be disabled globally by a
Windows registry policy, at which point trying to touch that address
family results in "Element not found" errors. This change skips programming
address families that Windows tell us are unavailable.
Fixes#1396.
Signed-off-by: David Anderson <danderson@tailscale.com>
Not great, but lets people working on new ports get going more quickly
without having to do everything up front.
As the link monitor is getting used more, I felt bad having a useless
implementation.
Updates #815
Updates #1427
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
DefaultRouteInterface was previously guarded by build tags such that
it was only accessible to tailscaled-on-macos, but there was no reason
for that. It runs fine in the sandbox and gives better default info,
so merge its file into interfaces_darwin.go.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We used to allow that, but now it just crashes.
Separately I need to figure out why it got into this path at all,
which is #1416.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Tor has a location-hidden service feature that enables users to host services
from inside the Tor network. Each of these gets a unique DNS name that ends with
.onion. As it stands now, if a misbehaving application somehow manages to make
a .onion DNS request to our DNS server, we will forward that to the DNS server,
which could leak that to malicious third parties. See the recent bug Brave had
with this[1] for more context.
RFC 7686 suggests that name resolution APIs and libraries MUST respond with
NXDOMAIN unless they can actually handle Tor lookups. We can't handle .onion
lookups, so we reject them.
[1]: https://twitter.com/albinowax/status/1362737949872431108Fixestailscale/corp#1351
Signed-off-by: Christine Dodrill <xe@tailscale.com>
Part of overall effort to clean up, unify, use link monitoring more,
and make Tailscale quieter when all networks are down. This is especially
bad on macOS where we can get killed for not being polite it seems.
(But we should be polite in any case)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Don't use os.NewFile or (*os.File).Close on the AF_ROUTE socket. It
apparently does weird things to the fd and at least doesn't seem to
close it. Just use the unix package.
The test doesn't actually fail reliably before the fix, though. It
was an attempt. But this fixes the integration tests.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Prior to e3df29d488, the Engine.SetLinkChangeCallback fired
immediately, even if there was no change. The ipnlocal code apparently
depended on that, and it broke integration tests (which live in
another repo). So mimic the old behavior and call the ipnlocal
callback immediately at init.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is a fork of wireguard-windows's firewall package, with
the firewall rules adjusted to better line up with tailscale's
needs.
The package was taken from commit 3cc76ed5f222ec82748ef3bd8c41d4b059e28cdb
in our fork of wireguard-go.
Signed-off-by: David Anderson <danderson@tailscale.com>
Gets it out of wgengine so the Engine isn't responsible for being a
callback registration hub for it.
This also removes the Engine.LinkChange method, as it's no longer
necessary. The monitor tells us about changes; it doesn't seem to
need any help. (Currently it was only used by Swift, but as of
14dc790137 we just do the same from Go)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And add a --socks5-server flag.
And fix a race in SOCKS5 replies where the response header was written
concurrently with the copy from the backend.
Co-authored with Naman Sood.
Updates #707
Updates #504
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously tailscaled on macOS was running "/sbin/route monitor" as a
child process, but child processes aren't allowed in the Network
Extension / App Store sandbox. Instead, just do what "/sbin/route monitor"
itself does: unix.Socket(unix.AF_ROUTE, unix.SOCK_RAW, 0) and read that.
We also parse it now, but don't do anything with the parsed results yet.
We will over time, as we have with Linux netlink messages over time.
Currently any message is considered a signal to poll and see what changed.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Currently it assumes exactly 1 registered callback. This changes it to
support 0, 1, or more than 1.
This is a step towards plumbing wgengine/monitor into more places (and
moving some of wgengine's interface state fetching into monitor in a
later step)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
For option (d) of #1405.
For an HTTPS request of /bootstrap-dns, this returns e.g.:
{
"log.tailscale.io": [
"2600:1f14:436:d603:342:4c0d:2df9:191b",
"34.210.105.16"
],
"login.tailscale.com": [
"2a05:d014:386:203:f8b4:1d5a:f163:e187",
"3.121.18.47"
]
}
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
UIs need to see the full unedited netmap in order to know what exit nodes they
can offer to the user.
Signed-off-by: David Anderson <danderson@tailscale.com>
* move probing out of netcheck into new net/portmapper package
* use PCP ANNOUNCE op codes for PCP discovery, rather than causing
short-lived (sub-second) side effects with a 1-second-expiring map +
delete.
* track when we heard things from the router so we can be less wasteful
in querying the router's port mapping services in the future
* use portmapper from magicsock to map a public port
Fixes#1298Fixes#1080Fixes#1001
Updates #864
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Importing the non-main package was missing some dependencies that
"go mod tidy" would then cleanup. Also added a non-ignore build tag to
avoid other tools getting upset about importing a main package.
Signed-off-by: Filippo Valsorda <hi@filippo.io>
$ GOOS=openbsd GOARCH=arm64 go install tailscale.com/cmd/...@latest
pkg/mod/github.com/kr/pty@v1.1.4-0.20190131011033-7dc38fb350b1/pty_openbsd.go:24:10: undefined: ptmget
pkg/mod/github.com/kr/pty@v1.1.4-0.20190131011033-7dc38fb350b1/pty_openbsd.go:25:34: undefined: ioctl_PTMGET
"go mod tidy" did some unrelated work in go.sum, maybe because it was
not run with Go 1.16 before.
Signed-off-by: Filippo Valsorda <hi@filippo.io>
This makes cidrDiff do as much as possible before failing, and makes a
delete of an already-deleted rule be a no-op. We should never do this
ourselves, but other things on the system can, and this should help us
recover a bit.
Also adds the start of root-requiring tests.
TODO: hook into wgengine/monitor and notice when routes are changed
behind our back, and invalidate our routes map and re-read from
kernel (via the ip command) at least on the next reconfig call.
Updates tailscale/corp#1338
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And open up socket permissions like Linux, now that we know who
connections are from.
This uses the new inet.af/peercred that supports Linux and Darwin at
the moment.
Fixes#1347Fixes#1348
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Tangentially related to #987, #177, #594, #925, #505
Motivated by rebooting a launchd-controlled tailscaled and it going
into SetNetworkUp(false) mode immediately because there really is no
network up at system boot, but then it got stuck in that paused state
forever, without a monitor implementation.
The interface.State logging tried to only log interfaces which had
interesting IPs, but the what-is-interesting checks differed between
the code that gathered the interface names to print and the printing
of their addresses.
When a handshake race occurs, a queued data packet can get lost.
TestTwoDevicePing expected that the very first data packet would arrive.
This caused occasional flakes.
Change TestTwoDevicePing to repeatedly re-send packets
and succeed when one of them makes it through.
This is acceptable (vs making WireGuard not drop the packets)
because this only affects communication with extremely old clients.
And those extremely old clients will eventually connect,
because the kernel will retry sends on timeout.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
We modified the standard net package to not allocate a *net.UDPAddr
during a call to (*net.UDPConn).ReadFromUDP if the caller's use
of the *net.UDPAddr does not cause it to escape.
That is https://golang.org/cl/291390.
This is the companion change to magicsock.
There are two changes required.
First, call ReadFromUDP instead of ReadFrom, if possible.
ReadFrom returns a net.Addr, which is an interface, which always allocates.
Second, reduce the lifetime of the returned *net.UDPAddr.
We do this by immediately converting it into a netaddr.IPPort.
We left the existing RebindingUDPConn.ReadFrom method in place,
as it is required to satisfy the net.PacketConn interface.
With the upstream change and both of these fixes in place,
we have removed one large allocation per packet received.
name old time/op new time/op delta
ReceiveFrom-8 16.7µs ± 5% 16.4µs ± 8% ~ (p=0.310 n=5+5)
name old alloc/op new alloc/op delta
ReceiveFrom-8 112B ± 0% 64B ± 0% -42.86% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
ReceiveFrom-8 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.008 n=5+5)
Co-authored-by: Sonia Appasamy <sonia@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
addrSet maintained duplicate lists of netaddr.IPPorts and net.UDPAddrs.
Unify to use the netaddr type only.
This makes (*Conn).ReceiveIPvN a bit uglier,
but that'll be cleaned up in a subsequent commit.
This is preparatory work to remove an allocation from ReceiveIPv4.
Co-authored-by: Sonia Appasamy <sonia@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
I based my estimation of the required timeout based on locally
observed behavior. But CI machines are worse than my local machine.
16s was enough to reduce flakiness but not eliminate it. Bump it up again.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Build tags have been updated to build native Apple M1 binaries, existing build
tags for ios have been changed from darwin,arm64 to ios,arm64.
With this change, running go build cmd/tailscale{,d}/tailscale{,d}.go on an Apple
machine with the new processor works and resulting binaries show the expected
architecture, e.g. tailscale: Mach-O 64-bit executable arm64.
Tested using go version go1.16beta1 darwin/arm64.
Updates #943
Signed-off-by: moncho <50428+moncho@users.noreply.github.com>
It only affects 'go install ./...', etc, and only on darwin/arm64 (M1 Macs) where
the go-ole package doesn't compile.
No need to build it.
Updates #943
This was in place because retrieved allowed_ips was very expensive.
Upstream changed the data structure to make them cheaper to compute.
This commit is an experiment to find out whether they're now cheap enough.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
We removed the "fast retry" code from our wireguard-go fork.
As a result, pings can take longer to transit when retries are required.
Allow that.
Fixes#1277
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
The fix can make this test run unconditionally.
This moves code from 5c619882bc for
testability but doesn't fix it yet. The #1282 problem remains (when I
wrote its wake-up mechanism, I forgot there were N DERP readers
funneling into 1 UDP reader, and the code just isn't correct at all
for that case).
Also factor out some test helper code from BenchmarkReceiveFrom.
The refactoring in magicsock.go for testability should have no
behavior change.
If no exit node is specified, the filter must still run to remove
offered default routes from all peers.
Signed-off-by: David Anderson <danderson@tailscale.com>
This one alone doesn't modify the global dependency map much
(depaware.txt if anything looks slightly worse), but it leave
controlclient as only containing NetworkMap:
bradfitz@tsdev:~/src/tailscale.com/ipn$ grep -F "controlclient." *.go
backend.go: NetMap *controlclient.NetworkMap // new netmap received
fake_test.go: b.notify(Notify{NetMap: &controlclient.NetworkMap{}})
fake_test.go: b.notify(Notify{NetMap: &controlclient.NetworkMap{}})
handle.go: netmapCache *controlclient.NetworkMap
handle.go:func (h *Handle) NetMap() *controlclient.NetworkMap {
Once that goes into a leaf package, then ipn doesn't depend on
controlclient at all, and then the client gets smaller.
Updates #1278
Upstream wireguard-go decided to use errors.Is(err, net.ErrClosed)
instead of checking the error string.
It also provided an unsafe linknamed version of net.ErrClosed
for clients running Go 1.15. Switch to that.
This reduces the time required for the wgengine/magicsock tests
on my machine from ~35s back to the ~13s it was before
456cf8a376.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Magicsock started dropping all traffic internally when Tailscale is
shut down, to avoid spurious wireguard logspam. This made the benchmark
not receive anything. Setting a dummy private key is sufficient to get
magicsock to pass traffic for benchmarking purposes.
Fixes#1270.
Signed-off-by: David Anderson <danderson@tailscale.com>
And move a couple other types down into leafier packages.
Now cmd/tailscale doesn't bring in netlink, magicsock, wgengine, etc.
Fixes#1181
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Unused for now, but I want to backport this commit to 1.4 so 1.6 can
start sending these and then at least 1.4 logs will stringify nicely.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This reverts commit da4ec54756.
Since v6 got disabled for Windows nodes, I need the debug flag back
to figure out why it was broken.
Signed-off-by: David Anderson <danderson@tailscale.com>
Use tb.Cleanup to simplify both the API and the implementation.
One behavior change: When the number of goroutines shrinks, don't log.
I've never found these logs to be useful, and they frequently add noise.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
The code was using a C "int", which is a signed 32-bit integer.
That means some valid IP addresses were negative numbers.
(In particular, the default router address handed out by AT&T
fiber: 192.168.1.254. No I don't know why they do that.)
A negative number is < 255, and so was treated by the Go code
as an error.
This fixes the unit test failure:
$ go test -v -run=TestLikelyHomeRouterIPSyscallExec ./net/interfaces
=== RUN TestLikelyHomeRouterIPSyscallExec
interfaces_darwin_cgo_test.go:15: syscall() = invalid IP, false, netstat = 192.168.1.254, true
--- FAIL: TestLikelyHomeRouterIPSyscallExec (0.00s)
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
Previously we disabled v6 support if the disable_policy knob was
missing in /proc, but some kernels support policy routing without
exposing the toggle. So instead, treat disable_policy absence as a
"maybe", and make the direct `ip -6 rule` probing a bit more
elaborate to compensate.
Fixes#1241.
Signed-off-by: David Anderson <danderson@tailscale.com>
This is mostly code movement from the wireguard-go repo.
Most of the new wgcfg package corresponds to the wireguard-go wgcfg package.
wgengine/wgcfg/device{_test}.go was device/config{_test}.go.
There were substantive but simple changes to device_test.go to remove
internal package device references.
The API of device.Config (now wgcfg.DeviceConfig) grew an error return;
we previously logged the error and threw it away.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
We log lines like this:
c.logf("[v1] magicsock: disco: %v->%v (%v, %v) sent %v", c.discoShort, dstDisco.ShortString(), dstKey.ShortString(), derpStr(dst.String()), disco.MessageSummary(m))
The leading [v1] causes it to get unintentionally rate limited.
Until we have a proper fix, work around it.
Fixes#1216
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
#### `POST /api/v2/tailnet/:tailnet/acl` - set ACL for a tailnet
Sets the ACL for the given tailnet. HuJSON and JSON are both accepted inputs. An `If-Match` header can be set to avoid missed updates.
Sets the ACL for the given domain.
HuJSON and JSON are both accepted inputs.
An `If-Match` header can be set to avoid missed updates.
Returns error for invalid ACLs.
Returns error if using an `If-Match` header and the ETag does not match.
Returns the updated ACL in JSON or HuJSON according to the `Accept` header on success. Otherwise, errors are returned for incorrectly defined ACLs, ACLs with failing tests on attempted updates, and mismatched `If-Match` header and ETag.
##### Parameters
@@ -380,7 +381,17 @@ Returns error if using an `If-Match` header and the ETag does not match.
`Accept` - Sets the return type of the updated ACL. Response is parsed `JSON` if `application/json` is explicitly named, otherwise HuJSON will be returned.
###### POST Body
ACL JSON or HuJSON (see https://tailscale.com/kb/1018/acls)
The POST body should be a JSON or [HuJSON](https://github.com/tailscale/hujson#hujson---human-json) formatted JSON object.
An ACL policy may contain the following top-level properties:
*`Groups` - Static groups of users which can be used for ACL rules.
*`Hosts` - Hostname aliases to use in place of IP addresses or subnets.
*`ACLs` - Access control lists.
*`TagOwners` - Defines who is allowed to use which tags.
*`Tests` - Run on ACL updates to check correct functionality of defined ACLs.
See https://tailscale.com/kb/1018/acls for more information on those properties.
runSTUN=flag.Bool("stun",false,"also run a STUN server")
meshPSKFile=flag.String("mesh-psk-file",defaultMeshPSKFile(),"if non-empty, path to file containing the mesh pre-shared key file. It should contain some hex string; whitespace is trimmed.")
meshWith=flag.String("mesh-with","","optional comma-separated list of hostnames to mesh with; the server's own hostname can be in the list")
bootstrapDNS=flag.String("bootstrap-dns-names","","optional comma-separated list of hostnames to make available at /bootstrap-dns")
verifyClients=flag.Bool("verify-clients",false,"verify clients to this DERP server through a local tailscaled instance.")
upf.BoolVar(&upArgs.reset,"reset",false,"reset unspecified settings to their default values")
upf.StringVar(&upArgs.server,"login-server",ipn.DefaultControlURL,"base URL of control server")
upf.BoolVar(&upArgs.acceptRoutes,"accept-routes",false,"accept routes advertised by other Tailscale nodes")
upf.BoolVar(&upArgs.acceptDNS,"accept-dns",true,"accept DNS configuration from the admin panel")
upf.BoolVar(&upArgs.singleRoutes,"host-routes",true,"install host routes to other Tailscale nodes")
upf.StringVar(&upArgs.exitNodeIP,"exit-node","","Tailscale IP of the exit node for internet traffic")
upf.BoolVar(&upArgs.exitNodeAllowLANAccess,"exit-node-allow-lan-access",false,"Allow direct access to the local network when routing traffic via an exit node")
upf.StringVar(&upArgs.advertiseTags,"advertise-tags","","comma-separated ACL tags to request; each must start with \"tag:\" (e.g. \"tag:eng,tag:montreal,tag:ssh\")")
upf.StringVar(&upArgs.hostname,"hostname","","hostname to use instead of the one provided by the OS")
upf.StringVar(&upArgs.advertiseRoutes,"advertise-routes","","routes to advertise to other nodes (comma-separated, e.g. \"10.0.0.0/8,192.168.0.0/24\")")
upf.BoolVar(&upArgs.advertiseDefaultRoute,"advertise-exit-node",false,"offer to be an exit node for internet traffic for the tailnet")
ifsafesocket.GOOSUsesPeerCreds(goos){
upf.StringVar(&upArgs.opUser,"operator","","Unix username to allow to operate on tailscaled without sudo")
}
switchgoos{
case"linux":
upf.BoolVar(&upArgs.snat,"snat-subnet-routes",true,"source NAT traffic to local routes advertised with --advertise-routes")
upf.StringVar(&upArgs.netfilterMode,"netfilter-mode",defaultNetfilterMode(),"netfilter mode (one of on, nodivert, off)")
case"windows":
upf.BoolVar(&upArgs.forceDaemon,"unattended",false,"run in \"Unattended Mode\" where Tailscale keeps running even after the current GUI user logs out (Windows-only)")
returnnil,fmt.Errorf("invalid IP address %q for --exit-node: %v",upArgs.exitNodeIP,err)
}
}elseifupArgs.exitNodeAllowLANAccess{
returnnil,fmt.Errorf("--exit-node-allow-lan-access can only be used with --exit-node")
}
ifupArgs.exitNodeIP!=""{
for_,ip:=rangest.TailscaleIPs{
ifexitNodeIP==ip{
returnnil,fmt.Errorf("cannot use %s as the exit node as it is a local IP address to this machine, did you mean --advertise-exit-node?",upArgs.exitNodeIP)
flag.Var(flagtype.PortValue(&args.port,magicsock.DefaultPort),"port","UDP port to listen on for WireGuard and peer-to-peer traffic; 0 means automatically select")
flag.StringVar(&args.socksAddr,"socks5-server","",`optional [ip]:port to run a SOCK5 server (e.g. "localhost:1080")`)
flag.StringVar(&args.tunname,"tun",defaultTunName(),`tunnel interface name; use "userspace-networking" (beta) to not use TUN`)
flag.Var(flagtype.PortValue(&args.port,0),"port","UDP port to listen on for WireGuard and peer-to-peer traffic; 0 means automatically select")
flag.StringVar(&args.statepath,"state",paths.DefaultTailscaledStateFile(),"path of state file")
flag.StringVar(&args.socketpath,"socket",paths.DefaultTailscaledSocket(),"path of the service unix socket")
flag.BoolVar(&printVersion,"version",false,"print version information and exit")
err:=fixconsole.FixConsoleIfNeeded()
iferr!=nil{
log.Fatalf("fixConsoleOutput: %v",err)
iflen(os.Args)>1{
sub:=os.Args[1]
iffp,ok:=subCommands[sub];ok{
if*fp==nil{
log.SetFlags(0)
log.Fatalf("%s not available on %v",sub,runtime.GOOS)
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.