Beam search logit refactor #771

Open: rhenry-nv wants to merge 7 commits into marian-nmt:master from rhenry-nv:beam_search_logit_refactor

@rhenry-nv

Contributor

Description

Refactors beam search so that logit retrieval from the GPU is batched when --n-best is specified.

On a standard transformer with 3 decoder layers, this change yielded improvements of up to 10% when --n-best is specified. It also reduces CPU-GPU communication.

This PR does not include a results table like the other PRs from #743, because the model motivating this change differs from the proxy model used in #743.

List of changes:

  • Refactors beam search and adds new tensor operators to support batched retrieval.
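
To illustrate what batched retrieval means here, the following is a minimal, CPU-only C++ sketch of the general idea, not the code in this PR and not Marian's API. `FakeDevice`, `copyToHost`, `gather`, `fetchOneByOne`, and `fetchBatched` are hypothetical names introduced only for this example. The "device" is an ordinary `std::vector`, and a counter stands in for device-to-host transfers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a GPU. "Device" buffers are plain vectors here;
// the counter records how many simulated device-to-host copies were issued.
struct FakeDevice {
  std::size_t transfers = 0;

  // Simulated device-to-host copy of a contiguous range: one transfer.
  std::vector<float> copyToHost(const std::vector<float>& deviceBuf,
                                std::size_t offset, std::size_t count) {
    assert(offset + count <= deviceBuf.size());
    ++transfers;
    auto first = deviceBuf.begin() + static_cast<std::ptrdiff_t>(offset);
    return std::vector<float>(first, first + static_cast<std::ptrdiff_t>(count));
  }

  // Simulated on-device gather: the result stays on the device,
  // so no transfer is counted.
  std::vector<float> gather(const std::vector<float>& deviceBuf,
                            const std::vector<std::size_t>& indices) const {
    std::vector<float> out;
    out.reserve(indices.size());
    for (std::size_t idx : indices) {
      assert(idx < deviceBuf.size());
      out.push_back(deviceBuf[idx]);
    }
    return out;
  }
};

// Unbatched retrieval: one device-to-host copy per requested logit.
std::vector<float> fetchOneByOne(FakeDevice& dev,
                                 const std::vector<float>& logits,
                                 const std::vector<std::size_t>& indices) {
  std::vector<float> host;
  host.reserve(indices.size());
  for (std::size_t idx : indices)
    host.push_back(dev.copyToHost(logits, idx, 1)[0]);
  return host;
}

// Batched retrieval: gather on the device first, then copy once.
std::vector<float> fetchBatched(FakeDevice& dev,
                                const std::vector<float>& logits,
                                const std::vector<std::size_t>& indices) {
  if (indices.empty()) return {};
  std::vector<float> gathered = dev.gather(logits, indices);
  return dev.copyToHost(gathered, 0, gathered.size());
}
```

In this sketch, fetching four logits with `fetchOneByOne` issues four simulated transfers, while `fetchBatched` returns the same values with one. On real hardware the gather would be a device kernel and the copy a single device-to-host transfer, which is where any saving in CPU-GPU communication would come from.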

Added dependencies: none

How to test

Ran regression tests and they passed.

CMake command: cmake .. -DCOMPILE_CPU=on -DCOMPILE_CUDA=on -DUSE_SENTENCEPIECE=on -DUSE_STATIC_LIBS=off -DCOMPILE_SERVER=off -DUSE_FBGEMM=on -DCOMPILE_CUDA_SM35=off -DCOMPILE_CUDA_SM50=off -DCOMPILE_CUDA_SM60=off -DCOMPILE_CUDA_SM70=on -DCOMPILE_CUDA_SM75=off -DCOMPILE_TESTS=on

Ubuntu - 18.04.3 LTS
nvcc - 10.1.243
gcc - 7.5.0

Checklist

  • I have tested the code manually
  • I have run regression tests
  • I have read and followed CONTRIBUTING.md
  • I have updated CHANGELOG.md
