juicefs tries to connect to an address that is not set in the configuration #4970

@greensea

What happened:

I have an etcd cluster with 3 nodes. All nodes are on the same VPN (100.80.x.x), and each node also has a LAN address (192.168.0.x).
The nodes sit in different LANs and cannot reach each other directly; they can only communicate over the VPN (100.80.x.x). I also built the cluster inside the VPN.

I created a JuiceFS file system:

juicefs format --storage etcd  --bucket etcd://ss.ts.bbxy.net:2379/ etcd://ss.ts.bbxy.net:2379/myjfs-meta myjfs

Note: ss.ts.bbxy.net resolves to 100.80.x.x

Then mount it:

juicefs mount  etcd://ss.ts.bbxy.net:2379/myjfs-meta mnt --verbose

Then I copied a large file into mnt, and the command got stuck. juicefs printed these error logs:

...
2024/06/23 17:55:25.744427 juicefs[161011] <DEBUG>: txn with 1 conds and 1 ops took 22.128292ms [tkv_etcd.go:191]
2024/06/23 17:55:25.792084 juicefs[161011] <DEBUG>: txn with 1 conds and 1 ops took 19.145083ms [tkv_etcd.go:191]
{"level":"warn","ts":"2024-06-23T17:55:33.625558+0800","logger":"etcd-client","caller":"v3@v3.5.9/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000dea380/ss.ts.bbxy.net:2380","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: dial tcp 192.168.0.44:2379: i/o timeout\""}
{"level":"info","ts":"2024-06-23T17:55:33.625667+0800","logger":"etcd-client","caller":"v3@v3.5.9/client.go:210","msg":"Auto sync endpoints failed.","error":"context deadline exceeded"}
{"level":"warn","ts":"2024-06-23T17:55:33.645211+0800","logger":"etcd-client","caller":"v3@v3.5.9/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0017fca80/ss.ts.bbxy.net:2380","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: dial tcp 192.168.0.33:2379: i/o timeout\""}
{"level":"info","ts":"2024-06-23T17:55:33.645272+0800","logger":"etcd-client","caller":"v3@v3.5.9/client.go:210","msg":"Auto sync endpoints failed.","error":"context deadline exceeded"}
2024/06/23 17:55:37.963580 juicefs[161011] <DEBUG>: txn with 1 conds and 1 ops took 20.836128ms [tkv_etcd.go:191]
2024/06/23 17:56:18.342892 juicefs[158671] <WARNING>: Upload chunks/0/4/4107_0_4194304: timeout after 1m0s: function timeout (try 7) [cached_store.go:407]
2024/06/23 17:56:18.376189 juicefs[158671] <WARNING>: Upload chunks/0/4/4107_1_4194304: timeout after 1m0s: function timeout (try 7) [cached_store.go:407]
2024/06/23 17:56:18.400583 juicefs[158671] <WARNING>: Upload chunks/0/4/4107_2_4194304: timeout after 1m0s: function timeout (try 7) [cached_store.go:407]
...

The logs show that juicefs is trying to connect to 192.168.0.44 (the LAN address of ss.ts.bbxy.net) and 192.168.0.33 (another etcd node). Both are LAN addresses that I cannot reach, because they belong to a different LAN. I think this is why the file copy gets stuck.

The weird thing is that I configured both the etcd cluster and juicefs inside the VPN (100.80.x.x), so juicefs should not know about any LAN address (192.168.0.x) and should not try to connect to one.
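
To double-check which addresses the client actually dialed, a small script along these lines can pull them out of a saved log. This is only a sketch: the function names and the 100.80.0.0/16 range are my assumptions based on the description above, not part of JuiceFS or etcd.

```python
import ipaddress
import re

# Assumption: the VPN uses 100.80.x.x, as described above.
VPN_NET = ipaddress.ip_network("100.80.0.0/16")

# Matches the "dial tcp <ip>:<port>" fragment in the etcd-client warnings.
DIAL_RE = re.compile(r"dial tcp (\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def dialed_addresses(log_text):
    """Return unique (ip, port) pairs the client tried to dial, in first-seen order."""
    seen = []
    for ip, port in DIAL_RE.findall(log_text):
        pair = (ip, int(port))
        if pair not in seen:
            seen.append(pair)
    return seen


def outside_vpn(pairs, vpn_net=VPN_NET):
    """Keep only the pairs whose IP address is not inside the VPN range."""
    return [(ip, port) for ip, port in pairs
            if ipaddress.ip_address(ip) not in vpn_net]


# Fragments copied from the two etcd-client warnings in the log above.
SAMPLE_LOG = """
transport: Error while dialing: dial tcp 192.168.0.44:2379: i/o timeout
transport: Error while dialing: dial tcp 192.168.0.33:2379: i/o timeout
"""

if __name__ == "__main__":
    for ip, port in outside_vpn(dialed_addresses(SAMPLE_LOG)):
        print(f"dialed outside the VPN: {ip}:{port}")
```

The "Auto sync endpoints failed" lines make me suspect the etcd client refreshes its endpoint list from the cluster's member list, so the LAN addresses may come from the client URLs the members advertise. I have not confirmed this.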

What you expected to happen:

cp does not get stuck, and juicefs does not try to connect to a LAN address (192.168.0.x).

How to reproduce it (as minimally and precisely as possible):

Already described above.

Anything else we need to know?

No

Environment:

  • JuiceFS version (use juicefs --version) or Hadoop Java SDK version: juicefs version 1.2.0+2024-06-18.873c47b

  • Cloud provider or hardware configuration running JuiceFS: self maintained

  • OS (e.g cat /etc/os-release): Debian trixie/sid

  • Kernel (e.g. uname -a): Linux 6.8.12-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.8.12-1 (2024-05-31) x86_64 GNU/Linux

  • Object storage (cloud provider and region, or self maintained): self maintained

  • Metadata engine info (version, cloud provider managed or self maintained): etcd Version: 3.4.33

  • Network connectivity (JuiceFS to metadata engine, JuiceFS to object storage): tailscaled VPN

  • Others:
