6 changes: 3 additions & 3 deletions doc/dpdispatcher_on_yarn.md
@@ -2,7 +2,7 @@

## Background

-Currently, DPGen(or other DP softwares) supports for HPC systems like Slurm, PBS, LSF and cloud machines. In order to run DPGen jobs on ByteDance internal platform, we need to extend it to support yarn resources. Hadoop Ecosystem is a very commonly used platform to process the big data, and in the process of developing the new interface, we found it can be implemented by only using hadoop opensource components. So for the convenience of the masses, we decided to contribute the codes to opensource community.
+Currently, DPGen(or other DP software) supports for HPC systems like Slurm, PBS, LSF and cloud machines. In order to run DPGen jobs on ByteDance internal platform, we need to extend it to support yarn resources. Hadoop Ecosystem is a very commonly used platform to process the big data, and in the process of developing the new interface, we found it can be implemented by only using hadoop opensource components. So for the convenience of the masses, we decided to contribute the codes to opensource community.

## Design

@@ -95,7 +95,7 @@ class DistributedShell(Machine):
pass

def gen_script_command(self, job):
-        """ Generate the shell script to be executed in DistibutedShell container
+        """ Generate the shell script to be executed in DistributedShell container

Parameters
----------
@@ -115,7 +115,7 @@ The following is an example of generated shell script. It will be executed in a
```
#!/bin/bash

-## set envionment variables
+## set environment variables
source /opt/intel/oneapi/setvars.sh

## download the tar file from hdfs which contains forward files
2 changes: 1 addition & 1 deletion doc/examples/expanse.md
@@ -2,7 +2,7 @@

[Expanse](https://www.sdsc.edu/support/user_guides/expanse.html) is a cluster operated by the San Diego Supercomputer Center. Here we provide an example to run jobs on the expanse.

-The machine parameters are provided below. Expanse uses the SLURM workload manager for job scheduling. {ref}`remote_root <machine/remote_root>` has been created in advance. It's worth metioned that we do not recommend to use the password, so [SSH keys](https://www.ssh.com/academy/ssh/key) are used instead to improve security.
+The machine parameters are provided below. Expanse uses the SLURM workload manager for job scheduling. {ref}`remote_root <machine/remote_root>` has been created in advance. It's worth mentioned that we do not recommend to use the password, so [SSH keys](https://www.ssh.com/academy/ssh/key) are used instead to improve security.

```{literalinclude} ../../examples/machine/expanse.json
---
2 changes: 1 addition & 1 deletion doc/getting-started.md
@@ -111,7 +111,7 @@ and `task.json` is
}
```

-You may also submit mutiple GPU jobs:
+You may also submit multiple GPU jobs:
complex resources example

```python3