ghost_reads.fa.separate_flanking.fa': No such file or directory #127

Open
ucsfpan opened this issue Sep 10, 2024 · 0 comments

ucsfpan commented Sep 10, 2024

Hi,
Thank you for your great tools! I am trying to run xTea on long-read PacBio CCS data, following the instructions in the README. My command is "xtea_long -i sample_id.txt -b long_read_bam_list.txt -p /bastianlab/data1/hpan/xTea -o submit_jobs.sh --rmsk /bastianlab/data1/hpan/xTea/rep_lib_annotation/LINE/hg38/hg38_L1_larger_500_with_all_L1HS.out -r /bastianlab/data1/Shared_datasets/Database/References/ucsc_hg38.bwa-index/hg38.fa --cns /bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus/LINE1.fa --rep /bastianlab/data1/hpan/xTea/rep_lib_annotation --xtea /c4/home/ucsf-pan/software/xTea/xtea_long -f 31 -y 15 -n 8 -m 32 --slurm -q long -t 2-0:0:0"
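
For reference, here is a minimal sketch of a pre-flight check on the input paths passed in the command above. The helper name `find_missing_inputs` is hypothetical and not part of xTea; it only reports paths that are missing or are empty files, and it makes no assumption about the internal format of `sample_id.txt` or `long_read_bam_list.txt`.

```python
import os


def find_missing_inputs(paths):
    """Return (path, reason) pairs for paths that are missing or empty files.

    Hypothetical helper for illustration; it is not part of xTea.
    """
    problems = []
    for path in paths:
        if not os.path.exists(path):
            problems.append((path, "missing"))
        elif os.path.isfile(path) and os.path.getsize(path) == 0:
            problems.append((path, "empty"))
    return problems


if __name__ == "__main__":
    # Paths copied from the xtea_long command in this post.
    inputs = [
        "sample_id.txt",
        "long_read_bam_list.txt",
        "/bastianlab/data1/hpan/xTea/rep_lib_annotation",
        "/bastianlab/data1/Shared_datasets/Database/References/ucsc_hg38.bwa-index/hg38.fa",
    ]
    for path, reason in find_missing_inputs(inputs):
        print(f"{reason}: {path}")
```

Running it from the directory where `xtea_long` was invoked prints one line per problematic path and nothing if all of them are present and non-empty.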

But the following errors occur:
clip cutoff is: 0
Loaded consensus file list: ['/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/polyA.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/LINE1.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/ALU.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/HERV.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/SVA_ori.fa']
Begin to construct the TE kmer library!
The TE kmer library is constructed/loaded!

Error: File None doesn't exist!!!

Running command: minimap2 -x ava-pb -c -a -t 8 /bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa /bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa | samtools view -hSb - | samtools sort -o /bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa.algn_2_itself.sorted.bam -

[ERROR] failed to open file '/bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa': No such file or directory
Traceback (most recent call last):
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_main.py", line 352, in <module>
i_max_clip, i_min_overlap, iset_cutoff, s_cluster_folder)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_ghost_TE.py", line 111, in cluster_reads_by_flank_region
m_info, m_reads, l_reads = self._parse_self_aligned_reads(sf_algnmt, i_max_clip)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_ghost_TE.py", line 392, in _parse_self_aligned_reads
samfile = pysam.AlignmentFile(sf_bam, "rb")
File "pysam/libcalignmentfile.pyx", line 741, in pysam.libcalignmentfile.AlignmentFile.__cinit__
File "pysam/libcalignmentfile.pyx", line 990, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rb') - is it SAM/BAM format? Consider opening with check_sq=False
Running command: minimap2 -ax asm5 -t 8 /bastianlab/data1/Shared_datasets/Database/References/ucsc_hg38.bwa-index/hg38.fa /bastianlab/data1/hpan/xTea/MaMel-144al/all_ins_seqs.fa | samtools view -hSb - | samtools sort -o /bastianlab/data1/hpan/xTea/MaMel-144al/tmp/classification/all_tei_seq_2_ref.bam -

^CTraceback (most recent call last):
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_main.py", line 311, in <module>
lrc.classify_ins_seqs(sf_rep_ins, sf_ref, flk_lenth, sf_rslt)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_rep_classification.py", line 121, in classify_ins_seqs
self.classify_from_ref_algnmt(sf_ref, sf_rep_ins, sf_rslt)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_rep_classification.py", line 114, in classify_from_ref_algnmt
xtea_contig.align_contigs_2_reference_genome(sf_ref, sf_rep_ins, self.n_jobs, sf_algnmt)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/x_contig.py", line 106, in align_contigs_2_reference_genome
self.run_cmd(cmd)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/x_contig.py", line 47, in run_cmd
self.cmd_runner.run_cmd_small_output(cmd)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/cmd_runner.py", line 13, in run_cmd_small_output
subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE).communicate()
File "/c4/home/ucsf-pan/miniconda3/envs/svim/lib/python3.7/subprocess.py", line 951, in communicate
stdout = self.stdout.read()
KeyboardInterrupt
(svim) [ucsf-pan@c4-n25 MaMel-144al]$ sh run_xTEA_pipeline.sh
Ave coverage is 0: using parameters clip with value 1

Ave coverage is 0: using parameters clip with value 1

clip cutoff is: 0
Loaded consensus file list: ['/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/polyA.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/LINE1.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/ALU.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/HERV.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/SVA_ori.fa']
Begin to construct the TE kmer library!

The TE kmer library is constructed/loaded!

Error: File None doesn't exist!!!

Running command: minimap2 -x ava-pb -c -a -t 8 /bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa /bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa | samtools view -hSb - | samtools sort -o /bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa.algn_2_itself.sorted.bam -

[ERROR] failed to open file '/bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa': No such file or directory
Traceback (most recent call last):
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_main.py", line 352, in <module>
i_max_clip, i_min_overlap, iset_cutoff, s_cluster_folder)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_ghost_TE.py", line 111, in cluster_reads_by_flank_region
m_info, m_reads, l_reads = self._parse_self_aligned_reads(sf_algnmt, i_max_clip)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_ghost_TE.py", line 392, in _parse_self_aligned_reads
samfile = pysam.AlignmentFile(sf_bam, "rb")
File "pysam/libcalignmentfile.pyx", line 741, in pysam.libcalignmentfile.AlignmentFile.__cinit__
File "pysam/libcalignmentfile.pyx", line 990, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rb') - is it SAM/BAM format? Consider opening with check_sq=False
Running command: minimap2 -ax asm5 -t 8 /bastianlab/data1/Shared_datasets/Database/References/ucsc_hg38.bwa-index/hg38.fa /bastianlab/data1/hpan/xTea/MaMel-144al/all_ins_seqs.fa | samtools view -hSb - | samtools sort -o /bastianlab/data1/hpan/xTea/MaMel-144al/tmp/classification/all_tei_seq_2_ref.bam -

[M::mm_idx_gen::53.871*1.58] collected minimizers
[M::mm_idx_gen::75.459*1.80] sorted minimizers
[M::main::75.459*1.80] loaded/built the index for 455 target sequence(s)
[M::mm_mapopt_update::102.087*1.59] mid_occ = 144
[M::mm_idx_stat] kmer size: 19; skip: 19; is_hpc: 0; #seq: 455
[M::mm_idx_stat::106.027*1.57] distinct minimizers: 214834535 (90.55% are singletons); average occurrences: 1.424; average spacing: 10.491; total length: 3209286105
ERROR: failed to open file '/bastianlab/data1/hpan/xTea/MaMel-144al/all_ins_seqs.fa': No such file or directory
ERROR: failed to map the query file
Running command: samtools index /bastianlab/data1/hpan/xTea/MaMel-144al/tmp/classification/all_tei_seq_2_ref.bam
Working on polyA with contigs /bastianlab/data1/hpan/xTea/MaMel-144al/all_ins_seqs.fa and consensus /bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/polyA.fa

Running command: minimap2 -k11 -w5 --sr --frag=yes -A2 -B4 -O4,8 -E2,1 -r150 -p.5 -N5 -n1 -m20 -s30 -g200 -2K50m --MD --heap-sort=yes --secondary=no --cs -a -t 8 /bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/polyA.fa /bastianlab/data1/hpan/xTea/MaMel-144al/all_ins_seqs.fa | samtools view -hSb - | samtools sort -o /bastianlab/data1/hpan/xTea/MaMel-144al/tmp/classification/polyA_cns.bam -
[M::mm_idx_gen::0.002*1.85] collected minimizers
[M::mm_idx_gen::0.003*3.76] sorted minimizers
[M::main::0.003*3.73] loaded/built the index for 1 target sequence(s)
[M::mm_mapopt_update::0.003*3.69] mid_occ = 219
[M::mm_idx_stat] kmer size: 11; skip: 5; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.003*3.65] distinct minimizers: 1 (0.00% are singletons); average occurrences: 218.000; average spacing: 1.050; total length: 229
ERROR: failed to open file '/bastianlab/data1/hpan/xTea/MaMel-144al/all_ins_seqs.fa': No such file or directory
ERROR: failed to map the query file
Running command: samtools index /bastianlab/data1/hpan/xTea/MaMel-144al/tmp/classification/polyA_cns.bam
Traceback (most recent call last):
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_main.py", line 311, in <module>
lrc.classify_ins_seqs(sf_rep_ins, sf_ref, flk_lenth, sf_rslt)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_rep_classification.py", line 175, in classify_ins_seqs
self.get_unmasked_seqs(sf_rep_ins_tmp, sf_tmp_out, sf_new_tmp)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_rep_classification.py", line 297, in get_unmasked_seqs
with pysam.FastxFile(sf_ori) as fin_ori, open(sf_new, "w") as fout_new:
File "pysam/libcfaidx.pyx", line 550, in pysam.libcfaidx.FastxFile.__cinit__
File "pysam/libcfaidx.pyx", line 580, in pysam.libcfaidx.FastxFile._open
OSError: file /bastianlab/data1/hpan/xTea/MaMel-144al/all_ins_seqs.fa not found
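
To make the log above easier to read, the following is a small sketch that pulls out the distinct files that minimap2 reported as missing, in the order they first appear. The function name `missing_files_in_log` is hypothetical, and the pattern only matches the "failed to open file '...': No such file or directory" wording seen in this log.

```python
import re

# Matches the "failed to open file" messages as they appear in the log above.
MISSING_FILE = re.compile(
    r"failed to open file '([^']+)': No such file or directory"
)


def missing_files_in_log(log_text):
    """Return distinct missing file paths in order of first appearance."""
    seen = []
    for match in MISSING_FILE.finditer(log_text):
        path = match.group(1)
        if path not in seen:
            seen.append(path)
    return seen


if __name__ == "__main__":
    # Two lines copied from the log in this post.
    sample = (
        "[ERROR] failed to open file '/bastianlab/data1/hpan/xTea/MaMel-144al/"
        "ghost_reads.fa.separate_flanking.fa': No such file or directory\n"
        "ERROR: failed to open file '/bastianlab/data1/hpan/xTea/MaMel-144al/"
        "all_ins_seqs.fa': No such file or directory\n"
    )
    for path in missing_files_in_log(sample):
        print(path)
```

Applied to this log, it lists `ghost_reads.fa.separate_flanking.fa` first and `all_ins_seqs.fa` second, so the earliest missing intermediate file is the one to look into first.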

I have no idea how to solve this. Could you please advise me on how to fix it?
