openvariant.tasks package

Submodules

openvariant.tasks.cat module

Cat task

Core functionality for running the cat task.

openvariant.tasks.cat.cat(base_path: str, annotation_path: Optional[str] = None, where: Optional[str] = None, header_show: bool = True, output: Optional[str] = None, skip_files: bool = False) → None

Print the parsed files to stdout or to the file given by “output”.

Parses the input files according to their annotation schema and writes the result to stdout, or to the output file when one is given. The header can be shown or omitted, and a ‘where’ expression can be added to filter the rows.

Parameters
  • base_path (str) – Base path of input files.

  • annotation_path (str or None) – Path of annotation file.

  • where (str or None) – Conditional statement.

  • header_show (bool) – Shows header on the output.

  • output (str or None) – Save output on a file.

  • skip_files (bool) – Skip unreadable files and directories.
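
Because cat returns None and prints its result, a caller that wants the rows as a string has to capture stdout. The following is a minimal sketch of one way to do that, not part of openvariant: `capture_cat` is a hypothetical helper, the task function is passed in as `cat_fn` so the snippet stays self-contained, and it assumes the task writes through Python's `sys.stdout`.

```python
import contextlib
import io
from typing import Callable, Optional


def capture_cat(cat_fn: Callable[..., None], base_path: str,
                annotation_path: Optional[str] = None,
                header_show: bool = True) -> str:
    """Run a cat-like task and return what it printed as a string.

    Hypothetical helper: cat_fn is expected to accept the parameters
    documented above (annotation_path, header_show, skip_files).
    """
    buffer = io.StringIO()
    # redirect_stdout only captures writes that go through sys.stdout.
    with contextlib.redirect_stdout(buffer):
        cat_fn(base_path, annotation_path=annotation_path,
               header_show=header_show, skip_files=True)
    return buffer.getvalue()
```

With openvariant installed, this could be called as `capture_cat(cat, "path/to/data")` after `from openvariant.tasks import cat`, where the path is a placeholder. When the result should simply end up in a file, passing `output` to cat directly is the simpler route.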

openvariant.tasks.count module

Count task

Core functionality for running the count task.

openvariant.tasks.count.count(base_path: str, annotation_path: str, group_by: Optional[str] = None, where: Optional[str] = None, cores: int = 2, quite: bool = False, skip_files: bool = False) → Tuple[int, Optional[dict]]

Print the count result to stdout.

Parses the input files according to their annotation schema and prints the number of rows to stdout. The count can be grouped by a field, and a ‘where’ expression can be added to restrict which rows are counted.

Parameters
  • base_path (str) – Base path of input files.

  • annotation_path (str or None) – Path of annotation file.

  • group_by (str or None) – Field to group the result.

  • where (str or None) – Conditional statement.

  • quite (bool) – Discard progress bar.

  • cores (int) – Number of cores to parallelize the task.

  • skip_files (bool) – Skip unreadable files and directories.

Returns

  • int – The total number of rows.

  • dict or None – The separate groups with the number of rows in each.
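
The returned tuple can be unpacked and post-processed directly. The sketch below turns the per-group counts into fractions of the total. `group_fractions` is a hypothetical helper, the task function is injected as `count_fn` so the snippet runs without openvariant, and it assumes the dict maps each group value to its row count.

```python
from typing import Callable, Dict, Optional, Tuple


def group_fractions(count_fn: Callable[..., Tuple[int, Optional[dict]]],
                    base_path: str, annotation_path: str,
                    group_by: Optional[str] = None) -> Dict[str, float]:
    """Return each group's share of the total row count."""
    # quite=True uses the documented (sic) parameter name to hide the progress bar.
    total, groups = count_fn(base_path, annotation_path,
                             group_by=group_by, quite=True)
    # The second element is Optional: treat None (or an empty dict) as "no groups".
    if not groups or total == 0:
        return {}
    # Assumption: groups maps each group value to its number of rows.
    return {str(key): rows / total for key, rows in groups.items()}
```

With openvariant installed, `count_fn` would be `openvariant.tasks.count`; the paths and the field name passed as `group_by` depend on the annotation file in use.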

openvariant.tasks.groupby module

Group by task

Core functionality for running the group by task.

openvariant.tasks.groupby.group_by(base_path: str, annotation_path: str, script: str, key_by: str, where: Optional[str] = None, cores=2, quite=False, header: bool = False, skip_files: bool = False) → Generator[Tuple[str, List, bool], None, None]

Print the group by result to stdout.

Parses the input files according to their annotation schema and shows the parsed result separately for each value of the key_by field. A ‘where’ expression can be added, and the result can also be run through a bash script.

Parameters
  • base_path (str) – Base path of input files.

  • annotation_path (str or None) – Path of annotation file.

  • script (str or None) – Path of the bash script to run on the grouped result.

  • key_by (str) – Field to group the result.

  • where (str or None) – Conditional statement.

  • quite (bool) – Discard progress bar.

  • cores (int) – Number of cores to parallelize the task.

  • header (bool) – Shows header on the output.

  • skip_files (bool) – Skip unreadable files and directories.

Returns

  • Generator[Tuple[str, List, bool], None, None] – Yields the grouped result as (str, List, bool) tuples.
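
According to its signature, group_by is a generator of (str, List, bool) tuples, so nothing is produced until it is iterated. The sketch below drains it into a dictionary. `collect_groups` is a hypothetical helper, the task function is injected as `group_by_fn`, and the reading of the tuple (group value, then the list for that group) is an assumption; the meaning of the bool is not described here, so it is ignored.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def collect_groups(group_by_fn: Callable[..., Iterable[Tuple[str, List, bool]]],
                   base_path: str, annotation_path: str, key_by: str,
                   script: Optional[str] = None) -> Dict[str, List]:
    """Drain the generator and gather the yielded lists per group value."""
    collected: Dict[str, List] = {}
    # Positional order follows the documented signature:
    # base_path, annotation_path, script, key_by.
    for group_value, rows, _flag in group_by_fn(
            base_path, annotation_path, script, key_by, quite=True):
        # Assumption: the str is the group value and the List holds its rows.
        # The same group value may be yielded more than once, so extend.
        collected.setdefault(group_value, []).extend(rows)
    return collected
```

With openvariant installed, `group_by_fn` would be `openvariant.tasks.group_by`. Collecting everything into memory is only reasonable for small inputs; for large ones, processing each tuple as it is yielded keeps memory use flat.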

openvariant.tasks.plugin module

Plugin task

Core functionality for running the different plugin tasks.

Module contents

openvariant.tasks.cat(base_path: str, annotation_path: Optional[str] = None, where: Optional[str] = None, header_show: bool = True, output: Optional[str] = None, skip_files: bool = False) → None

Print the parsed files to stdout or to the file given by “output”.

Parses the input files according to their annotation schema and writes the result to stdout, or to the output file when one is given. The header can be shown or omitted, and a ‘where’ expression can be added to filter the rows.

Parameters
  • base_path (str) – Base path of input files.

  • annotation_path (str or None) – Path of annotation file.

  • where (str or None) – Conditional statement.

  • header_show (bool) – Shows header on the output.

  • output (str or None) – Save output on a file.

  • skip_files (bool) – Skip unreadable files and directories.

openvariant.tasks.count(base_path: str, annotation_path: str, group_by: Optional[str] = None, where: Optional[str] = None, cores: int = 2, quite: bool = False, skip_files: bool = False) → Tuple[int, Optional[dict]]

Print the count result to stdout.

Parses the input files according to their annotation schema and prints the number of rows to stdout. The count can be grouped by a field, and a ‘where’ expression can be added to restrict which rows are counted.

Parameters
  • base_path (str) – Base path of input files.

  • annotation_path (str or None) – Path of annotation file.

  • group_by (str or None) – Field to group the result.

  • where (str or None) – Conditional statement.

  • quite (bool) – Discard progress bar.

  • cores (int) – Number of cores to parallelize the task.

  • skip_files (bool) – Skip unreadable files and directories.

Returns

  • int – The total number of rows.

  • dict or None – The separate groups with the number of rows in each.

openvariant.tasks.group_by(base_path: str, annotation_path: str, script: str, key_by: str, where: Optional[str] = None, cores=2, quite=False, header: bool = False, skip_files: bool = False) → Generator[Tuple[str, List, bool], None, None]

Print the group by result to stdout.

Parses the input files according to their annotation schema and shows the parsed result separately for each value of the key_by field. A ‘where’ expression can be added, and the result can also be run through a bash script.

Parameters
  • base_path (str) – Base path of input files.

  • annotation_path (str or None) – Path of annotation file.

  • script (str or None) – Path of the bash script to run on the grouped result.

  • key_by (str) – Field to group the result.

  • where (str or None) – Conditional statement.

  • quite (bool) – Discard progress bar.

  • cores (int) – Number of cores to parallelize the task.

  • header (bool) – Shows header on the output.

  • skip_files (bool) – Skip unreadable files and directories.

Returns

  • Generator[Tuple[str, List, bool], None, None] – Yields the grouped result as (str, List, bool) tuples.