pyranges.get_fasta
Module Contents
pyranges.get_fasta.get_fasta(gr, path)
Get fasta sequence.
Parameters:
- gr (PyRanges) – Coordinates.
- path (str) – Path to fasta file.
Returns: Sequences, one per interval.
Return type: Series
Note
Sorting the PyRanges is likely to improve the speed.
Warning
The sequence names in the fasta headers must match the Chromosome names in gr.
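One way to check this requirement before calling get_fasta is to compare the header names in the fasta file with the chromosome names in use. The sketch below is plain Python and does not use pyranges. The helpers `fasta_names` and `missing_names` are hypothetical names introduced here for illustration, and the sketch assumes that a sequence name is the first whitespace-delimited token after `>` in a header line, which may differ from how the underlying fasta reader derives names.

```python
import os
import tempfile


def fasta_names(path):
    """Return the set of sequence names found in the headers of a FASTA file.

    Assumption: the name is the first whitespace-delimited token after '>'.
    """
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def missing_names(chromosomes, path):
    """Return the chromosome names that have no matching FASTA header, sorted."""
    return sorted(set(chromosomes) - fasta_names(path))


# Small demonstration with a throwaway FASTA file.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.fasta")
    with open(demo_path, "w") as handle:
        handle.write(">chr1 description\nATTACCAT\n>chr2\nGGCC\n")
    found = fasta_names(demo_path)
    missing = missing_names(["chr1", "1"], demo_path)

print(found)
print(missing)
```

Here `"1"` is reported as missing because the file names its sequence `chr1`. With a PyRanges, the first argument could be the unique values of its Chromosome column.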
Examples
>>> import pyranges as pr
>>> gr = pr.from_dict({"Chromosome": ["chr1", "chr1"],
...                    "Start": [5, 0], "End": [8, 5]})
>>> gr
+--------------+-----------+-----------+
| Chromosome   |     Start |       End |
| (category)   |   (int32) |   (int32) |
|--------------+-----------+-----------|
| chr1         |         5 |         8 |
| chr1         |         0 |         5 |
+--------------+-----------+-----------+
Unstranded PyRanges object has 2 rows and 3 columns from 1 chromosomes.
For printing, the PyRanges was sorted on Chromosome.
>>> tmp_handle = open("temp.fasta", "w+")
>>> _ = tmp_handle.write("> chr1\n")
>>> _ = tmp_handle.write("ATTACCAT")
>>> tmp_handle.close()
>>> seq = pr.get_fasta(gr, "temp.fasta")
>>> seq
0      CAT
1    ATTAC
dtype: object
>>> gr.seq = seq
>>> gr
+--------------+-----------+-----------+------------+
| Chromosome   |     Start |       End | seq        |
| (category)   |   (int32) |   (int32) | (object)   |
|--------------+-----------+-----------+------------|
| chr1         |         5 |         8 | CAT        |
| chr1         |         0 |         5 | ATTAC      |
+--------------+-----------+-----------+------------+
Unstranded PyRanges object has 2 rows and 4 columns from 1 chromosomes.
For printing, the PyRanges was sorted on Chromosome.
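The output above is consistent with Start and End being treated as 0-based, half-open coordinates, the same convention as Python string slicing. The sketch below reproduces the two sequences with plain slicing. It only illustrates the coordinate convention and is not how get_fasta is implemented; `slice_intervals` is a hypothetical helper introduced here.

```python
def slice_intervals(sequence, intervals):
    """Return sequence[start:end] for each (start, end) pair.

    Coordinates are 0-based and half-open: start is included, end is not.
    """
    return [sequence[start:end] for start, end in intervals]


# Same sequence and intervals as in the example above.
chr1 = "ATTACCAT"
result = slice_intervals(chr1, [(5, 8), (0, 5)])
print(result)  # ['CAT', 'ATTAC']
```

Under this convention an interval covers End - Start bases, so the rows (5, 8) and (0, 5) yield 3 and 5 bases respectively.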