With directory path
A simple example of how the find files task works with a directory path.
[1] -
from os import getcwd
from os.path import dirname
from openvariant import findfiles
dataset_folder = f'{dirname(getcwd())}/datasets/sample1'
findfiles yields every file that matches an annotation file found in the same folder or in a child folder.
findfiles function parameters:
base_path - Base path of the input folder.
annotation_path - Path of the annotation file.
skip_files - Skip unreadable files and directories.
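The matching behaviour described above (files in a folder and its child folders checked against patterns) can be pictured with the standard library alone. The sketch below is an illustration of the idea, not OpenVariant's implementation: `walk_and_match` is a hypothetical helper, and the two glob patterns are assumed stand-ins for what annotation files would declare.

```python
import fnmatch
import os
import tempfile


def walk_and_match(base_path, patterns):
    """Yield (file_path, pattern) for each file under base_path matching a pattern.

    Hypothetical helper for illustration only; it is not part of OpenVariant.
    """
    for folder, _subfolders, file_names in os.walk(base_path):
        for name in sorted(file_names):
            for pattern in patterns:
                if fnmatch.fnmatch(name, pattern):
                    yield os.path.join(folder, name), pattern
                    break  # the first matching pattern wins


# Build a throwaway folder tree with one child folder and one unrelated file.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "child"))
    for relative in ("a.maf.gz", "b.vcf.gz", "notes.txt",
                     os.path.join("child", "c.vcf.gz")):
        open(os.path.join(root, relative), "w").close()

    # Assumed patterns; real ones would come from the annotation files.
    matches = sorted(
        (os.path.relpath(path, root), pattern)
        for path, pattern in walk_and_match(root, ["*.maf.gz", "*.vcf.gz"])
    )

for path, pattern in matches:
    print(path, pattern)
```

The unrelated `notes.txt` is skipped because no pattern matches it, while the file in the child folder is still found.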
As the output shows, the matched files follow two patterns: *.vcf.gz and *.maf.gz.
[2] -
for file_path, annotation in findfiles(base_path=dataset_folder):
    print(f'File path: {file_path}')
    print(f'Annotation object: {annotation}')
    print("-------------------------------------")
File path: /home/dmartinez/openvariant/examples/datasets/sample1/5a3a743.wxs.maf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90363940>
-------------------------------------
File path: /home/dmartinez/openvariant/examples/datasets/sample1/22f5b2f.wxs.maf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90363940>
-------------------------------------
File path: /home/dmartinez/openvariant/examples/datasets/sample1/345c90e.raw_somatic_mutation.vcf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90720160>
-------------------------------------
File path: /home/dmartinez/openvariant/examples/datasets/sample1/de46011.raw_somatic_mutation.vcf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90720160>
-------------------------------------
File path: /home/dmartinez/openvariant/examples/datasets/sample1/sample1_1/3a70e22.raw_somatic_mutation.vcf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90720160>
-------------------------------------
File path: /home/dmartinez/openvariant/examples/datasets/sample1/sample1_1/4c0b87e.raw_somatic_mutation.vcf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90720160>
-------------------------------------
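In the output above, files that share a pattern print the same object address. This suggests they share one Annotation object, although it does not prove it, since an address can be reused once an object is freed. A minimal sketch for grouping results by annotation identity follows. `group_by_annotation` and `FakeAnnotation` (with its `pattern` attribute) are hypothetical names used so the example runs without OpenVariant; the only thing assumed about the real function is that it yields `(file_path, annotation)` pairs, as the loop above unpacks them.

```python
def group_by_annotation(pairs):
    """Group file paths by the identity of the annotation object paired with them.

    Hypothetical helper; accepts any iterable of (file_path, annotation) pairs.
    """
    groups = {}
    for file_path, annotation in pairs:
        # Storing the annotation keeps it alive, so its id() cannot be
        # reused by a different object while we are still grouping.
        _, paths = groups.setdefault(id(annotation), (annotation, []))
        paths.append(file_path)
    return list(groups.values())


class FakeAnnotation:
    """Stand-in for an annotation object; `pattern` is an illustrative attribute."""

    def __init__(self, pattern):
        self.pattern = pattern


maf = FakeAnnotation("*.maf.gz")
vcf = FakeAnnotation("*.vcf.gz")
pairs = [
    ("5a3a743.wxs.maf.gz", maf),
    ("22f5b2f.wxs.maf.gz", maf),
    ("345c90e.raw_somatic_mutation.vcf.gz", vcf),
]

grouped = group_by_annotation(pairs)
for annotation, paths in grouped:
    print(annotation.pattern, len(paths))
```

With OpenVariant installed, the same helper should accept `findfiles(base_path=dataset_folder)` in place of `pairs`.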
In the following example, we get files using a fixed annotation file.
Only files that follow the pattern described in that annotation file are detected.
[3] -
annotation_file = f'{dirname(getcwd())}/datasets/sample1/annotation_maf.yaml'
for file_path, annotation in findfiles(base_path=dataset_folder, annotation_path=annotation_file):
    print(f'File path: {file_path}')
    print(f'Annotation object: {annotation}')
    print("-------------------------------------")
File path: /home/dmartinez/openvariant/examples/datasets/sample1/5a3a743.wxs.maf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90382e80>
-------------------------------------
File path: /home/dmartinez/openvariant/examples/datasets/sample1/22f5b2f.wxs.maf.gz
Annotation object: <openvariant.annotation.annotation.Annotation object at 0x7f5e90382e80>
-------------------------------------
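To see what the fixed annotation file leaves out, the two result sets can be compared directly. The short sketch below uses file names copied from the two outputs above (paths shortened to be relative to `sample1`) and a plain set difference; it does not call OpenVariant.

```python
# File names from the first run (no fixed annotation file), relative to sample1.
all_matches = {
    "5a3a743.wxs.maf.gz",
    "22f5b2f.wxs.maf.gz",
    "345c90e.raw_somatic_mutation.vcf.gz",
    "de46011.raw_somatic_mutation.vcf.gz",
    "sample1_1/3a70e22.raw_somatic_mutation.vcf.gz",
    "sample1_1/4c0b87e.raw_somatic_mutation.vcf.gz",
}

# File names from the second run (annotation_maf.yaml passed as annotation_path).
maf_only = {
    "5a3a743.wxs.maf.gz",
    "22f5b2f.wxs.maf.gz",
}

# Files found in the first run but not in the second.
excluded = sorted(all_matches - maf_only)
for name in excluded:
    print(name)
```

Every excluded name ends in `.vcf.gz`, which is consistent with the fixed annotation file describing only the MAF pattern.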