Generate consensus repeat library from multiple species #299

Description

@tododge

We are interested in generating a repeat library that can be used to mask any species in a genus (of non-model fishes). Ideally, elements in this library would have common family names that could be used for comparative purposes. We have >10 high-quality assemblies that could be used as input.

Two options would be to 1) concatenate the assemblies and run RepeatModeler once to generate a single library, or 2) run RepeatModeler on each species separately, then combine the libraries and remove duplicate sequences.

Both approaches seem to have pros and cons. Building a library from a concatenated FASTA would mean that even single-copy sequences are present ~10 times (once per species), so they could look repetitive. But combining libraries from each species means that similar repeat families wouldn't share common names. Also, deciding how to remove duplicates seems like an issue in itself.
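
To make option 2 concrete, here is a minimal sketch of the "combine and remove duplicates" step. It is only a conservative first pass, not a recommended pipeline: it drops sequences that are exact duplicates (on either strand) and prefixes each header with a species tag so that identically named families from different runs don't collide. The function names, the species tags, and the `name#class` header layout are assumptions for illustration. Near-identical families would still need a real clustering step, which this does not attempt.

```python
from typing import Dict, List, Tuple

# Translation table for complementing DNA (N is left as N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def merge_libraries(libraries: Dict[str, str]) -> List[Tuple[str, str]]:
    """Merge per-species libraries, keeping the first copy of each sequence.

    `libraries` maps a (hypothetical) species tag to FASTA text.
    Two sequences count as duplicates only if they are identical,
    ignoring case and strand.
    """
    seen = set()
    merged: List[Tuple[str, str]] = []
    for species, fasta_text in libraries.items():
        for header, seq in parse_fasta(fasta_text):
            upper = seq.upper()
            # Strand-independent key: the smaller of the sequence and its
            # reverse complement.
            key = min(upper, reverse_complement(upper))
            if key in seen:
                continue
            seen.add(key)
            # Prefix with the species tag so names stay unique after merging.
            merged.append((f"{species}_{header}", seq))
    return merged
```

Note that this keeps whichever species comes first in the input, and that the species prefix makes names unique but does not give homologous families a shared name; that would still require comparing consensus sequences across species.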

We would very much appreciate any thoughts or advice from the experts. Thank you!
