With directory path

A simple example of how the find files task works on a directory path.

[1] - 
from os import getcwd
from os.path import dirname
from openvariant import findfiles

dataset_folder = f'{dirname(getcwd())}/datasets/sample1'

findfiles returns every file, of any type, that matches an annotation file located in the same folder or in a child folder.
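The recursive matching idea can be sketched with the standard library alone. This is only an illustration, not OpenVariant's implementation: the helper `walk_matching` and the hard-coded `patterns` list are hypothetical stand-ins for the patterns that annotation files would declare, and the sketch applies one fixed set of patterns to the whole tree.

```python
import fnmatch
import os
import tempfile


def walk_matching(base_path, patterns):
    """Yield paths under base_path whose file name matches any glob pattern."""
    for root, _dirs, files in os.walk(base_path):
        for name in sorted(files):
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                yield os.path.join(root, name)


# Hypothetical patterns; in OpenVariant they would come from annotation files.
patterns = ["*.maf.gz", "*.vcf.gz"]

with tempfile.TemporaryDirectory() as base:
    # Build a tiny tree: one match at the top level, one in a child folder,
    # and one file that matches no pattern.
    os.makedirs(os.path.join(base, "child"))
    for rel in ("a.maf.gz", "notes.txt", os.path.join("child", "b.vcf.gz")):
        open(os.path.join(base, rel), "w").close()
    found = sorted(os.path.relpath(p, base) for p in walk_matching(base, patterns))

print(found)
```

The file in the child folder is picked up because `os.walk` descends into subdirectories, while `notes.txt` is ignored because it matches no pattern.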

findfiles function parameters:

  • base_path - Base path of input folder.

  • annotation_path - Path of annotation file.

  • skip_files - Skip unreadable files and directories.

As the output below shows, the matched files follow two patterns: *.vcf.gz and *.maf.gz.

[2] - 
for file_path, annotation in findfiles(base_path=dataset_folder):
    print(f'File path: {file_path}')
    print(f'Annotation object: {annotation}')
    print("-------------------------------------")
File path: /home/dmartinez/openvariant/examples/datasets/sample1/5a3a743.wxs.maf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90363940>
-------------------------------------
File path: /home/dmartinez/openvariant/examples/datasets/sample1/22f5b2f.wxs.maf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90363940>
-------------------------------------
File path: /home/dmartinez/openvariant/examples/datasets/sample1/345c90e.raw_somatic_mutation.vcf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90720160>
-------------------------------------
File path: /home/dmartinez/openvariant/examples/datasets/sample1/de46011.raw_somatic_mutation.vcf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90720160>
-------------------------------------
File path: /home/dmartinez/openvariant/examples/datasets/sample1/sample1_1/3a70e22.raw_somatic_mutation.vcf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90720160>
-------------------------------------
File path: /home/dmartinez/openvariant/examples/datasets/sample1/sample1_1/4c0b87e.raw_somatic_mutation.vcf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90720160>
-------------------------------------
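To make the two patterns easier to see, the pairs can be grouped by file suffix. The sketch below is a hypothetical helper, not part of OpenVariant. It accepts any iterable of (file_path, annotation) pairs, so it runs here on stand-in data; passing `findfiles(base_path=dataset_folder)` instead is assumed to work the same way.

```python
from collections import defaultdict


def group_by_suffix(pairs, suffixes=(".maf.gz", ".vcf.gz")):
    """Group file paths from (file_path, annotation) pairs by matching suffix."""
    groups = defaultdict(list)
    for file_path, _annotation in pairs:
        for suffix in suffixes:
            if file_path.endswith(suffix):
                groups[suffix].append(file_path)
                break  # each file belongs to at most one group
    return dict(groups)


# Stand-in pairs; the annotation objects are not needed for grouping.
sample_pairs = [
    ("sample1/5a3a743.wxs.maf.gz", None),
    ("sample1/345c90e.raw_somatic_mutation.vcf.gz", None),
    ("sample1/sample1_1/3a70e22.raw_somatic_mutation.vcf.gz", None),
]

groups = group_by_suffix(sample_pairs)
for suffix, paths in groups.items():
    print(suffix, len(paths))
```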

The following example finds files using a fixed annotation file.

Only files that follow the pattern described in that annotation file are detected.

[3] - 
annotation_file = f'{dirname(getcwd())}/datasets/sample1/annotation_maf.yaml'

for file_path, annotation in findfiles(base_path=dataset_folder, annotation_path=annotation_file):
    print(f'File path: {file_path}')
    print(f'Annotation object: {annotation}')
    print("-------------------------------------")
File path: /home/dmartinez/openvariant/examples/datasets/sample1/5a3a743.wxs.maf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90382e80>
-------------------------------------
File path: /home/dmartinez/openvariant/examples/datasets/sample1/22f5b2f.wxs.maf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90382e80>
-------------------------------------