Skip to content

Add copy_symlinks keyword to Tar.tree_hash#167

Open
mkitti wants to merge 6 commits into
JuliaIO:masterfrom
mkitti:mkitti-copy-symlinks
Open

Add copy_symlinks keyword to Tar.tree_hash#167
mkitti wants to merge 6 commits into
JuliaIO:masterfrom
mkitti:mkitti-copy-symlinks

Conversation

@mkitti

@mkitti mkitti commented Jan 8, 2024

Copy link
Copy Markdown
Member

Implement copy_symlinks keyword for Tar.tree_hash

For JuliaLang/Pkg.jl#3643
https://github.com/JuliaBinaryWrappers/P4est_jll.jl/releases/download/P4est-v2.8.1+2/P4est.v2.8.1.x86_64-w64-mingw32-mpi+microsoftmpi.tar.gz

julia> open(GzipDecompressorStream, "P4est.v2.8.1.x86_64-w64-mingw32-mpi+microsoftmpi.tar.gz") do io
           Tar.tree_hash(io, copy_symlinks=false)
       end
"89a337ea6f60a4fd58999ab73dea099e41032138"

julia> open(GzipDecompressorStream, "P4est.v2.8.1.x86_64-w64-mingw32-mpi+microsoftmpi.tar.gz") do io
           Tar.tree_hash(io, copy_symlinks=true)
       end
"ed75b82e0dd9b53c4ac4e70376f3e6f330c72767"

For JuliaPackaging/Yggdrasil#7888
https://github.com/JuliaBinaryWrappers/iso_codes_jll.jl/releases/download/iso_codes-v4.11.0+0/iso_codes.v4.11.0.any.tar.gz

julia> open(GzipDecompressorStream, "iso_codes.v4.11.0.any.tar.gz") do io
           Tar.tree_hash(io, copy_symlinks=false)
       end
"71f68a3d55d73f2e15a3969c241fae2349b1feb5"

julia> open(GzipDecompressorStream, "iso_codes.v4.11.0.any.tar.gz") do io
           Tar.tree_hash(io, copy_symlinks=true)
       end
"409d6ac4c02dae43ff4fe576b5c5820d0386fb3f"

@codecov

codecov Bot commented Jan 8, 2024

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.64%. Comparing base (6269b5b) to head (0bcb514).
⚠️ Report is 19 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #167      +/-   ##
==========================================
+ Coverage   97.28%   97.64%   +0.36%     
==========================================
  Files           4        4              
  Lines         810      851      +41     
==========================================
+ Hits          788      831      +43     
+ Misses         22       20       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@mkitti mkitti marked this pull request as ready for review January 8, 2024 10:51
Comment thread test/data/iso_codes.v4.11.0.any.tar.gz Outdated
@nhz2

nhz2 commented Jan 13, 2024

Copy link
Copy Markdown
Member

Can the julia compat and CI tests be bumped to 1.6?

Comment thread test/runtests.jl
Tar.tree_hash(hdr->false, tar, algorithm="git-sha256", copy_symlinks=true)
end
NON_STDLIB_TESTS && begin
iso_codes_tarball = Downloads.download("https://github.com/JuliaBinaryWrappers/iso_codes_jll.jl/releases/download/iso_codes-v4.11.0+0/iso_codes.v4.11.0.any.tar.gz")

@nhz2 nhz2 Jan 13, 2024

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test should use a generated test tar file, like in the "copy symlinks" testset below. The tests will be more reliable if they don't require internet access.

@nhz2

nhz2 commented Jan 13, 2024

Copy link
Copy Markdown
Member

Adding the following test results in a stack overflow.

    tarball, io = mktemp()
    tar_write_link(io, "a", "b/q")
    tar_write_link(io, "b", "a/q")
    close(io)
    Tar.tree_hash(tarball; copy_symlinks=true)
Details
StackOverflowError:
  Stacktrace:
    [1] exec(re::Ptr{Nothing}, subject::String, offset::Int64, options::UInt32, match_data::Ptr{Nothing})
      @ Base.PCRE ./pcre.jl:203
    [2] exec_r_data
      @ Base.PCRE ./pcre.jl:220 [inlined]
    [3] _findnext_re(re::Regex, str::String, idx::Int64, match_data::Ptr{Nothing})
      @ Base ./regex.jl:442
    [4] findnext
      @ Base ./regex.jl:431 [inlined]
    [5] iterate
      @ Base ./strings/util.jl:556 [inlined]
    [6] iterate
      @ Base ./strings/util.jl:555 [inlined]
    [7] _collect(cont::UnitRange{Int64}, itr::Base.SplitIterator{String, Regex}, ::Base.HasEltype, isz::Base.SizeUnknown)
      @ Base ./array.jl:770
    [8] collect
      @ ./array.jl:759 [inlined]
    [9] #split#487
      @ ./strings/util.jl:628 [inlined]
   [10] split
      @ ./strings/util.jl:626 [inlined]
   [11] link_target(paths::Dict{String, Any}, path::String, link::String)
      @ Tar ~/github/Tar.jl/src/extract.jl:175
   [12] link_target(paths::Dict{String, Any}, path::String, link::String) (repeats 13767 times)
      @ Tar ~/github/Tar.jl/src/extract.jl:193
   [13] git_tree_hash(predicate::Function, tar::IOStream, ::Type{SHA.SHA1_CTX}, skip_empty::Bool, copy_symlinks::Bool; buf::Vector{UInt8})
      @ Tar ~/github/Tar.jl/src/extract.jl:264
   [14] git_tree_hash
      @ ~/github/Tar.jl/src/extract.jl:208 [inlined]
   [15] #98
      @ ~/github/Tar.jl/src/Tar.jl:408 [inlined]
   [16] open(f::Tar.var"#98#100"{Bool, Bool, Tar.var"#1#2"}, args::String; kwargs::@Kwargs{lock::Bool})
      @ Base ./io.jl:396
   [17] open_nolock
      @ ~/packages/julias/julia-1.10/share/julia/stdlib/v1.10/ArgTools/src/ArgTools.jl:35 [inlined]
   [18] arg_read
      @ ~/packages/julias/julia-1.10/share/julia/stdlib/v1.10/ArgTools/src/ArgTools.jl:74 [inlined]
   [19] tree_hash(predicate::Function, tarball::String; algorithm::String, skip_empty::Bool, copy_symlinks::Bool)
      @ Tar ~/github/Tar.jl/src/Tar.jl:407
   [20] tree_hash
      @ Tar ~/github/Tar.jl/src/Tar.jl:398 [inlined]
   [21] #tree_hash#102
      @ Tar ~/github/Tar.jl/src/Tar.jl:425 [inlined]
   [22] macro expansion
      @ ~/github/Tar.jl/test/runtests.jl:12 [inlined]
   [23] macro expansion
      @ ~/packages/julias/julia-1.10/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [24] top-level scope
      @ ~/github/Tar.jl/test/runtests.jl:4
   [25] include(fname::String)
      @ Base.MainInclude ./client.jl:489
   [26] top-level scope
      @ none:6
   [27] eval
      @ Core ./boot.jl:385 [inlined]
   [28] exec_options(opts::Base.JLOptions)
      @ Base ./client.jl:291
   [29] _start()
      @ Base ./client.jl:552
Test Summary:   | Error  Total  Time
copy symlinks 2 |     1      1  1.6s
ERROR: LoadError: Some tests did not pass: 0 passed, 0 failed, 1 errored, 0 broken.
in expression starting at /home/nathan/github/Tar.jl/test/runtests.jl:3
ERROR: Package Tar errored during testing

@nhz2

nhz2 commented Jan 13, 2024

Copy link
Copy Markdown
Member

The following test results in a stack overflow from a different function:

    tarball, io = mktemp()
    tar_write_dir(io,  "foo/bar")
    tar_write_link(io, "foo/bar/b", "../a")
    tar_write_link(io, "foo/a", "bar/")
    close(io)
    Tar.tree_hash(tarball; copy_symlinks=true)
Details
StackOverflowError:
  Stacktrace:
       [1] (::Tar.var"#34#44"{Dict{String, Any}, Tar.var"#by#43"})(io::IOBuffer)
         @ Tar ~/github/Tar.jl/src/extract.jl:329
       [2] sprint(::Function; context::Nothing, sizehint::Int64)
         @ Base ./strings/io.jl:114
       [3] sprint
         @ ./strings/io.jl:107 [inlined]
       [4] git_object_hash(emit::Function, kind::String, ::Type{SHA.SHA1_CTX})
         @ Tar ~/github/Tar.jl/src/extract.jl:347
       [5] (::Tar.var"#hash_tree#42"{SHA.SHA1_CTX})(node::Dict{String, Any})
         @ Tar ~/github/Tar.jl/src/extract.jl:328
       [6] (::Tar.var"#34#44"{Dict{String, Any}, Tar.var"#by#43"})(io::IOBuffer)
         @ Tar ~/github/Tar.jl/src/extract.jl:330
  --- the last 5 lines are repeated 6881 more times ---
   [34412] sprint(::Function; context::Nothing, sizehint::Int64)
         @ Base ./strings/io.jl:114
   [34413] sprint
         @ ./strings/io.jl:107 [inlined]
   [34414] git_object_hash(emit::Function, kind::String, ::Type{SHA.SHA1_CTX})
         @ Tar ~/github/Tar.jl/src/extract.jl:347
   [34415] (::Tar.var"#hash_tree#42"{SHA.SHA1_CTX})(node::Dict{String, Any})
         @ Tar ~/github/Tar.jl/src/extract.jl:328
   [34416] git_tree_hash(predicate::Function, tar::IOStream, ::Type{SHA.SHA1_CTX}, skip_empty::Bool, copy_symlinks::Bool; buf::Vector{UInt8})
         @ Tar ~/github/Tar.jl/src/extract.jl:338
   [34417] git_tree_hash
         @ ~/github/Tar.jl/src/extract.jl:209 [inlined]
   [34418] #98
         @ ~/github/Tar.jl/src/Tar.jl:408 [inlined]
   [34419] open(f::Tar.var"#98#100"{Bool, Bool, Tar.var"#1#2"}, args::String; kwargs::@Kwargs{lock::Bool})
         @ Base ./io.jl:396
   [34420] open_nolock
         @ ~/packages/julias/julia-1.10/share/julia/stdlib/v1.10/ArgTools/src/ArgTools.jl:35 [inlined]
   [34421] arg_read
         @ ~/packages/julias/julia-1.10/share/julia/stdlib/v1.10/ArgTools/src/ArgTools.jl:74 [inlined]
   [34422] tree_hash(predicate::Function, tarball::String; algorithm::String, skip_empty::Bool, copy_symlinks::Bool)
         @ Tar ~/github/Tar.jl/src/Tar.jl:407
   [34423] tree_hash
         @ Tar ~/github/Tar.jl/src/Tar.jl:398 [inlined]
   [34424] #tree_hash#102
         @ Tar ~/github/Tar.jl/src/Tar.jl:425 [inlined]
   [34425] macro expansion
         @ ~/github/Tar.jl/test/runtests.jl:13 [inlined]
   [34426] macro expansion
         @ ~/packages/julias/julia-1.10/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [34427] top-level scope
         @ ~/github/Tar.jl/test/runtests.jl:6
   [34428] include(fname::String)
         @ Base.MainInclude ./client.jl:489
   [34429] top-level scope
         @ none:6
   [34430] eval
         @ Core ./boot.jl:385 [inlined]
   [34431] exec_options(opts::Base.JLOptions)
         @ Base ./client.jl:291
Test Summary:   | Error  Total   Time
copy symlinks 2 |     1      1  59.1s
ERROR: LoadError: Some tests did not pass: 0 passed, 0 failed, 1 errored, 0 broken.
in expression starting at /home/nathan/github/Tar.jl/test/runtests.jl:3
ERROR: Package Tar errored during testing

<\details>

@mkitti

mkitti commented Jan 13, 2024

Copy link
Copy Markdown
Member Author

What does extract do in this circumstance?

@nhz2

nhz2 commented Jan 14, 2024

Copy link
Copy Markdown
Member

extract has a stack overflow in the first case and goes into an infinite loop in the second case.

@mkitti

mkitti commented Jan 16, 2024

Copy link
Copy Markdown
Member Author

We should determine the correct behavior for extract first. My sense is that it should produce a more intuitive and descriptive error.

@nhz2

nhz2 commented Jan 16, 2024

Copy link
Copy Markdown
Member

That makes sense, though sometimes extract will ignore symlinks that it cannot copy. This might be the same issue as #77 . Maybe these issues could be resolved while doing the refactoring described in #64

@StefanKarpinski

StefanKarpinski commented Jan 16, 2024

Copy link
Copy Markdown
Member

Yeah, this is a known issue for extract: #77. I just never got around to fixing it because it seems deeply unlikely in real-world tarballs, but of course, we should handle it correctly. But it's very annoying to fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants