Skip to content

Speaker diarization final#12

Open
Martaesplo wants to merge 20 commits into
mainfrom
speaker-diarization-final
Open

Speaker diarization final#12
Martaesplo wants to merge 20 commits into
mainfrom
speaker-diarization-final

Conversation

@Martaesplo

Copy link
Copy Markdown

#10
The main additions that are to be reviewed are the files: helpers.py, diarize.py and torun.py. Moreover, the notebook which has the same contents as the helpers.py and diarize.py files.

The rest of the files are directly copied from the example worker and not modified.

@Martaesplo Martaesplo requested a review from Veldhoen May 1, 2024 09:54
Comment thread base_util.py Outdated

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove anything you have not worked on from the PR/branch

@Veldhoen Veldhoen left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Marta,

Thanks for your PR! I have left quite some remarks. Though I know you won't be able to work on this anymore, it would be great if you can make sure to leave this repo in a neat state. Also, I think it would be educational for you as a developer to go through the process of submitting a PR until having it approved. It's such a pity we haven't gone through this process yet!

Comment thread config.yml Outdated

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove anything you have not worked on from the PR/branch

Comment thread config/config.yml Outdated

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove anything you have not worked on from the PR/branch

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove anything you have not worked on from the PR/branch

Comment thread .gitignore

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add data and config dirs

Comment thread diarize.py Outdated
Comment on lines +81 to +82
del msdd_model
torch.cuda.empty_cache()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense in a local environment, when deploying the model you probably do NOT want to remove the model from memory - so at least make this configurable

Comment thread models.py Outdated

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove anything you have not worked on from the PR/branch

Comment thread pyproject.toml Outdated

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please specify the requirements in the pyproject.toml
(Or leave it out completetly: remove anything you have not worked on from the PR/branch)

Comment thread worker.py Outdated

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove anything you have not worked on from the PR/branch

Comment thread diarize.py
from omegaconf import OmegaConf


def config_setup(output_dir):

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider supplying your own, augmented version of https://raw.githubusercontent.com/NVIDIA/NeMo/main/examples/speaker_tasks/diarization/conf/inference/diar_infer_{DOMAIN_TYPE}.yaml
So you can separate any config from the code.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Including it as a file in the repository?
There are three different types of configuration files, depending on the domain of the audio file. So I should put all three files in the repo? And you change the configurable parameters from the file itself?

Comment thread Dockerfile Outdated

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove anything you have not worked on from the PR/branch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants