feat!: change name #20

Merged 2 commits on Jan 30, 2024
2 changes: 1 addition & 1 deletion .github/workflows/release.yaml
@@ -103,7 +103,7 @@ jobs:
runs-on: ubuntu-latest
environment:
name: pypi
url: https://pypi.org/p/chainlifter
url: https://pypi.org/p/agct
permissions:
id-token: write # IMPORTANT: mandatory for trusted publishing
if: "startsWith(github.ref, 'refs/tags/')"
12 changes: 7 additions & 5 deletions README.md
@@ -1,20 +1,22 @@
Drop-in replacement for the [pyliftover](https://github.com/konstantint/pyliftover) tool. Name forthcoming.
# agct: Another Genome Conversion Tool

Status: very, very preliminary.
Drop-in replacement for the [pyliftover](https://github.com/konstantint/pyliftover) tool, using the St. Jude's [chainfile](https://docs.rs/chainfile/latest/chainfile/) crate. Enables significantly faster chainfile loading from cold start (see `analysis/`).

Status: alpha.

## Usage

Initialize a class instance:

```python3
from chainlifter.lifter import ChainLifter
ch = ChainLifter("hg38", "hg19")
from agct import Converter
c = Converter("hg38", "hg19")
```

Call ``convert_coordinate()``:

```python3
ch.convert_coordinate("chr7", 140453136, "+")
c.convert_coordinate("chr7", 140453136, "+")
# [['chr7', '140152936', '+']]
```

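Because this rename is breaking (the `feat!` prefix in the title), downstream code that imported `chainlifter.lifter.ChainLifter` has to move to `agct.Converter`. A minimal sketch of a transitional import shim is below. The helper `load_first_available` and the `RENAME_CANDIDATES` list are hypothetical names used only for illustration and are not part of either package; only the two import paths are taken from the README hunk above.

```python
import importlib
from typing import List, Optional, Tuple

# (module path, class name) pairs: post-rename location first, pre-rename second.
RENAME_CANDIDATES: List[Tuple[str, str]] = [
    ("agct", "Converter"),
    ("chainlifter.lifter", "ChainLifter"),
]


def load_first_available(candidates: List[Tuple[str, str]]) -> Optional[type]:
    """Return the first importable attribute from ``candidates``, or None."""
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Covers ModuleNotFoundError too: this distribution is not installed.
            continue
        found = getattr(module, attr_name, None)
        if found is not None:
            return found
    return None


converter_cls = load_first_available(RENAME_CANDIDATES)
if converter_cls is None:
    print("neither agct nor chainlifter is importable here")
else:
    print(f"using {converter_cls.__module__}.{converter_cls.__name__}")
```

Once every environment has `agct` installed, the shim can be dropped in favor of the plain `from agct import Converter` shown in the updated README.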
46 changes: 23 additions & 23 deletions analysis/speed_test.ipynb
@@ -16,7 +16,7 @@
"outputs": [],
"source": [
"from pyliftover import LiftOver \n",
"from chainlifter.lifter import ChainLifter"
"from agct import Converter"
]
},
{
@@ -45,7 +45,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"1.33 s ± 161 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
"1.11 s ± 26.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
@@ -64,13 +64,13 @@
"name": "stdout",
"output_type": "stream",
"text": [
"234 ms ± 7.96 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
"217 ms ± 9.13 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%%timeit\n",
"ch = ChainLifter(\"hg38\", \"hg19\")"
"converter = Converter(\"hg38\", \"hg19\")"
]
},
{
@@ -91,7 +91,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"1.2 s ± 27.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
"1.09 s ± 14.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
@@ -111,14 +111,14 @@
"name": "stdout",
"output_type": "stream",
"text": [
"233 ms ± 5.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
"215 ms ± 6.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%%timeit\n",
"ch = ChainLifter(\"hg38\", \"hg19\")\n",
"ch.convert_coordinate(\"chr5\", 1404391, \"+\")"
"converter = Converter(\"hg38\", \"hg19\")\n",
"converter.convert_coordinate(\"chr5\", 1404391, \"+\")"
]
},
{
@@ -138,7 +138,7 @@
"source": [
"# load beforehand\n",
"pyl = LiftOver(\"hg38\", \"hg19\")\n",
"ch = ChainLifter(\"hg38\", \"hg19\")"
"converter = Converter(\"hg38\", \"hg19\")"
]
},
{
@@ -151,7 +151,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"2.07 µs ± 67.4 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n"
"1.97 µs ± 72.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n"
]
}
],
@@ -170,13 +170,13 @@
"name": "stdout",
"output_type": "stream",
"text": [
"1.97 µs ± 12.6 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)\n"
"2.77 µs ± 103 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n"
]
}
],
"source": [
"%%timeit\n",
"ch.convert_coordinate(\"chr5\", 1404391, \"+\")"
"converter.convert_coordinate(\"chr5\", 1404391, \"+\")"
]
},
{
@@ -205,7 +205,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"335 ms ± 16.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
"303 ms ± 13.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
@@ -224,13 +224,13 @@
"name": "stdout",
"output_type": "stream",
"text": [
"63.2 ms ± 773 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
"62.6 ms ± 2.99 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
]
}
],
"source": [
"%%timeit\n",
"ch = ChainLifter(\"hg19\", \"hg38\")"
"converter = Converter(\"hg19\", \"hg38\")"
]
},
{
@@ -251,7 +251,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"321 ms ± 6.01 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
"318 ms ± 15.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
@@ -271,14 +271,14 @@
"name": "stdout",
"output_type": "stream",
"text": [
"63.5 ms ± 806 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
"57.8 ms ± 742 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
]
}
],
"source": [
"%%timeit\n",
"ch = ChainLifter(\"hg19\", \"hg38\")\n",
"ch.convert_coordinate(\"chr5\", 1404391, \"+\")"
"converter = Converter(\"hg19\", \"hg38\")\n",
"converter.convert_coordinate(\"chr5\", 1404391, \"+\")"
]
},
{
@@ -298,7 +298,7 @@
"source": [
"# load beforehand\n",
"pyl = LiftOver(\"hg19\", \"hg38\")\n",
"ch = ChainLifter(\"hg19\", \"hg38\")"
"converter = Converter(\"hg19\", \"hg38\")"
]
},
{
@@ -311,7 +311,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"2.02 µs ± 11.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n"
"2.16 µs ± 232 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)\n"
]
}
],
@@ -330,13 +330,13 @@
"name": "stdout",
"output_type": "stream",
"text": [
"2.02 µs ± 56 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n"
"2.87 µs ± 65 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n"
]
}
],
"source": [
"%%timeit\n",
"ch.convert_coordinate(\"chr5\", 1404391, \"+\")"
"converter.convert_coordinate(\"chr5\", 1404391, \"+\")"
]
}
],
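The updated notebook outputs can be turned into rough speedup ratios. The sketch below parses the `%%timeit` summary lines copied from the new cell outputs and divides the means. It assumes the first cell in each pair times pyliftover's `LiftOver` (that cell's source is not visible in this diff) and the second times `Converter`; `mean_seconds` and `speedup` are illustrative helper names, not part of the repository.

```python
import re

# Unit suffixes printed by %%timeit, mapped to seconds.
# Both the micro sign (U+00B5) and Greek mu (U+03BC) spellings are accepted.
_UNIT_TO_SECONDS = {"s": 1.0, "ms": 1e-3, "µs": 1e-6, "μs": 1e-6, "us": 1e-6, "ns": 1e-9}
_MEAN = re.compile(r"^\s*([\d.]+)\s*(\S+)\s*±")


def mean_seconds(timeit_line: str) -> float:
    """Extract the mean from a line like '217 ms ± 9.13 ms per loop ...'."""
    match = _MEAN.match(timeit_line)
    if match is None or match.group(2) not in _UNIT_TO_SECONDS:
        raise ValueError(f"unrecognized timeit line: {timeit_line!r}")
    return float(match.group(1)) * _UNIT_TO_SECONDS[match.group(2)]


def speedup(baseline_line: str, candidate_line: str) -> float:
    """Ratio of baseline mean to candidate mean (above 1 means candidate is faster)."""
    return mean_seconds(baseline_line) / mean_seconds(candidate_line)


# (label, assumed pyliftover output, agct output), copied from the new notebook cells.
RESULTS = [
    ("hg38->hg19 init", "1.11 s ± 26.1 ms per loop", "217 ms ± 9.13 ms per loop"),
    ("hg38->hg19 init + convert", "1.09 s ± 14.8 ms per loop", "215 ms ± 6.9 ms per loop"),
    ("hg38->hg19 convert only", "1.97 µs ± 72.9 ns per loop", "2.77 µs ± 103 ns per loop"),
    ("hg19->hg38 init", "303 ms ± 13.1 ms per loop", "62.6 ms ± 2.99 ms per loop"),
    ("hg19->hg38 init + convert", "318 ms ± 15.5 ms per loop", "57.8 ms ± 742 µs per loop"),
    ("hg19->hg38 convert only", "2.16 µs ± 232 ns per loop", "2.87 µs ± 65 ns per loop"),
]

for label, pyliftover_line, agct_line in RESULTS:
    print(f"{label}: {speedup(pyliftover_line, agct_line):.2f}x")
```

Under that pairing assumption, the cold-start cases favor `agct`, while the per-call cost on an already loaded converter comes out slightly higher than pyliftover's (ratio below 1).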
18 changes: 8 additions & 10 deletions pyproject.toml
@@ -1,18 +1,16 @@
[project]
name = "chainlifter"
name = "agct"
version = "0.1.0"
authors = [
{name = "James Stevenson"}
]
description = "Python frontend to Rust chainfile crate"
description = "Another Genome Conversion Tool: Python frontend to Rust chainfile crate"
readme = "README.md"
license = {file = "LICENSE"}
requires-python = ">=3.8"
classifiers = [
"Development Status :: 3 - Alpha",
"Programming Language :: Rust",
"Programming Language :: Python :: Implementation :: CPython",
"Programming Language :: Python :: Implementation :: PyPy",
"Programming Language :: Python",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3 :: Only",
@@ -38,19 +36,19 @@ dev = [
]

[project.urls]
Homepage = "https://github.com/genomicmedlab/chainlifter"
Documentation = "https://github.com/genomicmedlab/chainlifter"
Changelog = "https://github.com/genomicmedlab/chainlifter/releases"
Source = "https://github.com/genomicmedlab/chainlifter"
"Bug Tracker" = "https://github.com/genomicmedlab/chainlifter/issues"
Homepage = "https://github.com/genomicmedlab/agct"
Documentation = "https://github.com/genomicmedlab/agct"
Changelog = "https://github.com/genomicmedlab/agct/releases"
Source = "https://github.com/genomicmedlab/agct"
"Bug Tracker" = "https://github.com/genomicmedlab/agct/issues"

[build-system]
requires = ["maturin>=1.2,<2.0"]
build-backend = "maturin"

[tool.maturin]
features = ["pyo3/extension-module"]
module-name = "chainlifter._core"
module-name = "agct._core"
python-source = "src"

[tool.pytest.ini_options]
4 changes: 2 additions & 2 deletions rust/Cargo.toml
@@ -1,10 +1,10 @@
[package]
name = "chainlifter"
name = "agct"
version = "0.1.0"
edition = "2021"

[lib]
name = "chainlifter"
name = "agct"
crate-type = ["cdylib"]

[dependencies]
22 changes: 11 additions & 11 deletions rust/src/lib.rs
@@ -7,21 +7,21 @@ use pyo3::prelude::*;
use std::fs::File;
use std::io::BufReader;

create_exception!(chainlifter, NoLiftoverError, PyException);
create_exception!(chainlifter, ChainfileError, PyException);
create_exception!(chainlifter, StrandValueError, PyException);
create_exception!(agct, NoLiftoverError, PyException);
create_exception!(agct, ChainfileError, PyException);
create_exception!(agct, StrandValueError, PyException);

/// Define core ChainLifter class to be used by Python interface.
/// Define core Converter class to be used by Python interface.
/// Effectively just a wrapper on top of the chainfile crate's Machine struct.
#[pyclass]
pub struct ChainLifter {
pub struct Converter {
pub machine: chain::liftover::machine::Machine,
}

#[pymethods]
impl ChainLifter {
impl Converter {
#[new]
pub fn new(chainfile_path: &str) -> PyResult<ChainLifter> {
pub fn new(chainfile_path: &str) -> PyResult<Converter> {
let Ok(chainfile_file) = File::open(chainfile_path) else {
return Err(PyFileNotFoundError::new_err(format!(
"Unable to open chainfile located at \"{}\"",
@@ -36,7 +36,7 @@ impl ChainLifter {
&chainfile_path
)));
};
Ok(ChainLifter { machine })
Ok(Converter { machine })
}

/// Perform liftover
@@ -83,11 +83,11 @@ impl ChainLifter {
}
}

/// ChainLifter Python module. Collect Python-facing methods.
/// agct._core Python module. Collect Python-facing methods.
#[pymodule]
#[pyo3(name = "_core")]
fn chainlifter(_py: Python<'_>, m: &PyModule) -> PyResult<()> {
m.add_class::<ChainLifter>()?;
fn agct(_py: Python<'_>, m: &PyModule) -> PyResult<()> {
m.add_class::<Converter>()?;
m.add("NoLiftoverError", _py.get_type::<NoLiftoverError>())?;
m.add("ChainfileError", _py.get_type::<ChainfileError>())?;
m.add("StrandValueError", _py.get_type::<StrandValueError>())?;
4 changes: 4 additions & 0 deletions src/agct/__init__.py
@@ -0,0 +1,4 @@
"""Provide fast liftover in Python via the ``chainfile`` crate."""
from agct.converter import Converter, Genome, Strand

__all__ = ["Converter", "Strand", "Genome"]
14 changes: 7 additions & 7 deletions src/chainlifter/lifter.py → src/agct/converter.py
@@ -8,7 +8,7 @@
from wags_tails.utils.downloads import download_http, handle_gzip
from wags_tails.utils.storage import get_data_dir

import chainlifter._core as _core
import agct._core as _core

_logger = logging.getLogger(__name__)

@@ -31,7 +31,7 @@ class Genome(str, Enum):
HG19 = "hg19"


class ChainLifter:
class Converter:
"""Chainfile-based liftover provider for a single sequence to sequence
association.
"""
@@ -59,7 +59,7 @@ def __init__(self, from_db: Genome, to_db: Genome) -> None:
)
file, _ = data_handler.get_latest()
try:
self._chainlifter = _core.ChainLifter(str(file.absolute()))
self._converter = _core.Converter(str(file.absolute()))
except FileNotFoundError as e:
_logger.error("Unable to open chainfile located at %s", file.absolute())
raise e
@@ -100,10 +100,10 @@ def convert_coordinate(

.. code-block:: python

from chainlifter.lifter import ChainLifter, Strand
from agct import Converter, Strand

lifter = ChainLifter("hg19", "hg38")
lifter.convert_coordinate("chr7", 140453136, Strand.POSITIVE)
c = Converter("hg19", "hg38")
c.convert_coordinate("chr7", 140453136, Strand.POSITIVE)
# returns [['chr7', '140753336', '+']]


@@ -113,7 +113,7 @@ def convert_coordinate(
:return: list of coordinate matches (possibly empty)
"""
try:
results = self._chainlifter.lift(chrom, pos, strand)
results = self._converter.lift(chrom, pos, strand)
except _core.NoLiftoverError:
results = []
except _core.ChainfileError:
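Two behaviors visible in `converter.py` are easy to miss in the diff: `Genome` mixes in `str`, so plain strings such as `"hg19"` line up with the enum members, and `convert_coordinate()` turns a `NoLiftoverError` from the Rust core into an empty list. The following is a self-contained sketch of both, not the package's code. `FakeCore`, the local `NoLiftoverError`, the `HG38` member, and the `Strand` values are assumptions made for illustration; the single coordinate mapping is the hg19 to hg38 example from the docstring above.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Genome(str, Enum):
    HG38 = "hg38"  # assumed member; only HG19 is visible in the diff
    HG19 = "hg19"


class Strand(str, Enum):
    POSITIVE = "+"  # assumed value, matching the "+" used in the README
    NEGATIVE = "-"  # assumed member


class NoLiftoverError(Exception):
    """Local stand-in for agct._core.NoLiftoverError."""


class FakeCore:
    """Stand-in for the compiled core; knows one mapping from the docstring."""

    _KNOWN: Dict[Tuple[str, int], int] = {("chr7", 140453136): 140753336}

    def lift(self, chrom: str, pos: int, strand: str) -> List[List[str]]:
        try:
            new_pos = self._KNOWN[(chrom, pos)]
        except KeyError:
            raise NoLiftoverError(f"no liftover for {chrom}:{pos}") from None
        # Positions come back as strings, as in the documented return value.
        return [[chrom, str(new_pos), Strand(strand).value]]


def convert_coordinate(
    core: FakeCore, chrom: str, pos: int, strand: str = Strand.POSITIVE
) -> List[List[str]]:
    """Mirror the wrapper's pattern: a failed liftover yields an empty list."""
    try:
        return core.lift(chrom, pos, strand)
    except NoLiftoverError:
        return []


core = FakeCore()
print(Genome("hg19") is Genome.HG19)  # string values resolve to enum members
print(convert_coordinate(core, "chr7", 140453136, Strand.POSITIVE))
print(convert_coordinate(core, "chrX", 1))  # unmapped position
```

The design choice worth noting is that callers never need to catch `NoLiftoverError` themselves; an unmapped position simply produces no matches.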
1 change: 0 additions & 1 deletion src/chainlifter/__init__.py

This file was deleted.

2 changes: 0 additions & 2 deletions src/chainlifter/version.py

This file was deleted.
