Troubleshooting Guide¶
Common Issues and Solutions¶
Setup and Installation Issues¶
Git Configuration Problems¶
Problem: Git not configured properly
# Check current configuration
git config --list
# Error: Please tell me who you are
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
Problem: SSH key authentication fails
# Generate new SSH key
ssh-keygen -t ed25519 -C "your.email@example.com"
# Add to SSH agent
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519
# Test connection
ssh -T git@github.com
Permission Denied Errors¶
Problem: Cannot execute scripts or access files
# Fix script permissions
chmod +x script.sh
# Fix directory permissions
chmod 755 directory/
chmod -R 755 directory/ # Recursive
# Fix SSH key permissions
chmod 600 ~/.ssh/id_ed25519
chmod 644 ~/.ssh/id_ed25519.pub
Data Analysis Issues¶
FastQC Problems¶
Problem: FastQC fails to run
# Check Java installation
java -version
# Install Java if missing (Ubuntu/Debian)
sudo apt install default-jdk
# Run FastQC with memory limit
fastqc --memory 4096 *.fastq.gz
Problem: Out of memory errors
# Increase memory allocation
fastqc --memory 8192 file.fastq.gz
# Process files individually
for file in *.fastq.gz; do
fastqc --memory 4096 "$file"
done
Assembly Issues¶
Problem: SPAdes assembly fails
# Check available memory
free -h
# Run with memory limit
spades.py --memory 32 -1 R1.fastq.gz -2 R2.fastq.gz -o output/
# Try different k-mer sizes
spades.py -k 21,33,55 -1 R1.fastq.gz -2 R2.fastq.gz -o output/
Problem: Poor assembly quality (high fragmentation)
# Check input data quality first
fastqc input_files.fastq.gz
# Try more aggressive trimming
trimmomatic PE input_R1.fastq.gz input_R2.fastq.gz \
output_R1.fastq.gz output_R1_unpaired.fastq.gz \
output_R2.fastq.gz output_R2_unpaired.fastq.gz \
LEADING:10 TRAILING:10 SLIDINGWINDOW:4:20 MINLEN:50
# Use careful mode in SPAdes
spades.py --careful -1 trimmed_R1.fastq.gz -2 trimmed_R2.fastq.gz -o careful_assembly/
Tool Installation and Dependencies¶
Conda/Mamba Issues¶
Problem: Environment creation fails
# Update conda
conda update conda
# Clear package cache
conda clean --all
# Create environment with specific Python version
conda create -n genomics python=3.9
# Use mamba for faster solving
mamba create -n genomics python=3.9
Problem: Package conflicts
# Create minimal environment first
conda create -n clean_env python=3.9
# Activate and install packages one by one
conda activate clean_env
conda install -c bioconda fastqc
conda install -c bioconda spades
Docker/Singularity Issues¶
Problem: Permission denied with Docker
# Add user to docker group
sudo usermod -aG docker $USER
# Log out and back in, then test
docker run hello-world
Problem: Singularity image won't run
# Pull image explicitly
singularity pull docker://biocontainers/fastqc:v0.11.9_cv8
# Run with specific bind paths
singularity exec -B /data:/data image.sif fastqc --version
# Check image integrity
singularity verify image.sif
HPC and Remote Access Issues¶
SSH Connection Problems¶
Problem: Connection timed out
# Test basic connectivity
ping hostname
# Try different port
ssh -p 2222 username@hostname
# Use verbose mode for debugging
ssh -v username@hostname
Problem: Key exchange failed
# Generate compatible key
ssh-keygen -t rsa -b 4096
# Specify key explicitly
ssh -i ~/.ssh/specific_key username@hostname
# Check SSH config
cat ~/.ssh/config
SLURM Job Issues¶
Problem: Job stuck in queue
# Check queue status
squeue -u $USER
# Check job details
scontrol show job JOBID
# Check partition availability
sinfo
Problem: Job fails with memory errors
# Check job output
cat slurm-JOBID.out
# Increase memory request
#SBATCH --mem=32G
# Use multiple cores if available
#SBATCH --cpus-per-task=8
Data Processing Errors¶
File Format Issues¶
Problem: Unexpected file format
# Check file type
file filename
head filename
# Convert line endings if needed
dos2unix filename
# Check compression
gunzip -t file.gz
Problem: Corrupt or truncated files
# Check file integrity
md5sum file.fastq.gz
# Compare with provided checksum
# Test gzip integrity
gunzip -t file.fastq.gz
# Repair if possible (may lose data)
gzip -d file.fastq.gz
gzip file.fastq
Large File Handling¶
Problem: Running out of disk space
# Check disk usage
df -h
du -sh directory/
# Clean up temporary files
rm -rf temp/
rm *.tmp
# Compress large files
gzip *.fastq
tar -czf archive.tar.gz directory/
Problem: Processing very large files
# Process in chunks
split -l 4000000 large_file.fastq chunk_
# Process each chunk separately
# Use streaming where possible
zcat file.fastq.gz | head -n 1000000 | tool
# Use efficient tools
seqtk sample file.fastq.gz 10000 > sample.fastq
Analysis and Interpretation Issues¶
Resistance Gene Detection¶
Problem: No resistance genes found (expected some)
# Check assembly quality
quast.py assembly.fasta
# Try multiple databases
abricate --db resfinder assembly.fasta
abricate --db card assembly.fasta
abricate --db argannot assembly.fasta
# Reduce stringency
abricate --minid 80 --mincov 60 assembly.fasta
Problem: Too many false positives
# Increase stringency
abricate --minid 95 --mincov 90 assembly.fasta
# Verify hits manually
blast -query resistance_gene.fasta -subject assembly.fasta
# Check for truncated genes
abricate --mincov 95 assembly.fasta
Phylogenetic Analysis¶
Problem: Tree looks wrong or unrealistic
# Check sequence alignment quality
aliview alignment.fasta
# Remove problematic sequences
seqtk subseq sequences.fasta good_ids.txt > clean.fasta
# Try different tree method
FastTree -nt alignment.fasta > tree.newick
iqtree -s alignment.fasta -m TEST
Problem: Low bootstrap support
# Increase bootstrap replicates
iqtree -s alignment.fasta -bb 1000
# Check for recombination
gubbins alignment.fasta
# Use only core SNPs
snp-sites -c alignment.fasta > core_snps.fasta
Performance and Resource Issues¶
Memory Management¶
Problem: Out of memory errors
# Check memory usage
free -h
top
# Limit memory usage
ulimit -v 8000000 # Limit to ~8GB
# Use memory-efficient tools
minimap2 instead of BWA-MEM for large references
Problem: Process running too slowly
# Use multiple cores
tool -t 8 input output
# Optimize I/O
# Use local storage instead of network drives
cp data /tmp/
cd /tmp/
# Run analysis
cp results back/to/network/storage
Storage Management¶
Problem: Quota exceeded
# Find large files
find . -size +100M -ls
# Clean up intermediate files
rm *.sam # Keep only BAM files
rm temp_*
# Compress old data
tar -czf old_analysis.tar.gz old_directory/
rm -rf old_directory/
Getting Help¶
Before Asking for Help¶
- Check error messages carefully - Often contain specific solutions
- Search documentation - Tool manuals usually have troubleshooting sections
- Try simple test cases - Use small datasets to isolate problems
- Check system resources - Memory, disk space, permissions
How to Ask for Help¶
Include Essential Information¶
- Exact error message (copy-paste, don't retype)
- Command that failed (exact command with parameters)
- System information (OS, tool versions)
- Input file details (size, format, sample content)
Good Help Request Example¶
Subject: SPAdes assembly fails with error code 1
I'm running SPAdes on paired-end M. tuberculosis data:
Command: spades.py -1 sample_R1.fastq.gz -2 sample_R2.fastq.gz -o spades_out/
Error message:
"== Error == system call for: ['/usr/bin/python3', '/opt/spades/bin/spades_init.py'] finished abnormally, err code: 1"
System: Ubuntu 20.04, SPAdes v3.15.3
Input files: 2x150bp Illumina, ~50x coverage, 2.3GB total
Available memory: 32GB
Disk space: 500GB free
I've tried with --careful flag and different k-mer sizes but get the same error.
Support Resources¶
Course Support¶
- Instructors: Available during course hours
- Slack Channel:
#troubleshooting
- Office Hours: Daily 17:00-18:00 (course week)
- Peer Support: Encouraged among participants
Online Resources¶
- Biostars: General bioinformatics Q&A
- Stack Overflow: Programming and command line issues
- Tool Documentation: Always check official documentation
- Galaxy Training: Alternative tutorials and explanations
Emergency Contacts¶
- Technical Issues: tech-support@course.org
- Data Access Problems: data-admin@course.org
- General Questions: instructors@course.org
Prevention Tips¶
Best Practices¶
- Test with small datasets first
- Keep detailed logs of commands
- Use version control for scripts
- Regular backups of important results
- Document your workflow steps
Common Pitfalls to Avoid¶
- Running analysis without checking input quality
- Using inappropriate parameters for your data type
- Ignoring error messages and logs
- Not checking intermediate results
- Working in directories with spaces in names
- Not backing up important data
Remember: Most bioinformatics problems have been encountered before. Don't hesitate to search online and ask for help - the community is generally very supportive!