Commit

chore: update Hugging Face sync action and add line ending conversion script
titusz committed Aug 14, 2024
1 parent b5d6bb1 commit 0e65d50
Showing 5 changed files with 49 additions and 28 deletions.
16 changes: 8 additions & 8 deletions .github/workflows/huggingface.yml
@@ -1,4 +1,4 @@
-name: Publish on Hugging Face Hub
+name: Sync with Hugging Face Space
 on:
   push:
     branches:
@@ -7,12 +7,12 @@ jobs:
   build:
     runs-on: ubuntu-latest
     steps:
-      - name: Sync with Hugging Face
-        uses: nateraw/huggingface-sync-action@v0.0.5
+      - name: Sync with HF
+        uses: alex-bene/huggingface-space-sync-action@v0.1
         with:
-          github_repo_id: iscc/iscc-sct
-          huggingface_repo_id: iscc/iscc-sct
-          repo_type: space
-          space_sdk: gradio
-          private: false
+          github_repo_id: 'iscc/iscc-sct'
+          github_branch: 'huggingface'
+          yaml_header_path: 'space.yml'
+          huggingface_repo_id: 'iscc/iscc-sct'
+          hf_username: 'titusz'
           hf_token: ${{ secrets.HF_TOKEN }}
20 changes: 2 additions & 18 deletions README.md
@@ -1,15 +1,3 @@
----
-title: ISCC-LAB - Semantic-Code Text
-emoji: ▶️
-colorFrom: red
-colorTo: blue
-sdk: gradio
-sdk_version: 4.41.0
-pinned: true
-license: CC-BY-NC-SA-4.0
-short_description: Cross Lingual Similarity Preserving Text Simprints
----
-
 # ISCC - Semantic Text-Code
 
 [![Tests](https://github.com/iscc/iscc-sct/actions/workflows/tests.yml/badge.svg)](https://github.com/iscc/iscc-core/actions/workflows/tests.yml)
@@ -188,9 +176,5 @@ simprints based on larger chunks of text.
 ## Acknowledgements
 
 - Text Chunking: [text-splitter](https://github.com/benbrandt/text-splitter)
-- Text Embedding:
-  [Sentence-Transformer](https://www.sbert.net/docs/sentence_transformer/pretrained_models.html#original-models)
-
-## License
-
-This project is licensed under the CC-BY-NC-SA-4.0 International License.
+- Text Embeddings:
+  [Sentence-Transformers](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2)
22 changes: 22 additions & 0 deletions iscc_sct/dev.py
@@ -0,0 +1,22 @@
+import pathlib
+
+
+HERE = pathlib.Path(__file__).parent.absolute()
+
+
+def convert_lf():  # pragma: no cover
+    """Convert line endings to LF"""
+    crlf = b"\r\n"
+    lf = b"\n"
+    extensions = {".py", ".toml", ".lock", ".txt", ".yml", ".sh", ".md"}
+    n = 0
+    for fp in HERE.parent.glob("**/*"):
+        if fp.suffix in extensions:
+            with open(fp, "rb") as infile:
+                content = infile.read()
+            if crlf in content:
+                content = content.replace(crlf, lf)
+                with open(fp, "wb") as outfile:
+                    outfile.write(content)
+                n += 1
+    print(f"{n} files converted to LF")
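
The `convert_lf` helper added in `iscc_sct/dev.py` can be tried out safely against a throwaway directory. The following is a minimal sketch of the same CRLF-to-LF technique, not the code from this commit: the function name `convert_tree_to_lf`, the `root` parameter, and the `is_file()` guard are assumptions introduced here for illustration, and the sample file names are hypothetical.

```python
import pathlib
import tempfile

CRLF = b"\r\n"
LF = b"\n"
# Same extension set as the committed helper.
EXTENSIONS = {".py", ".toml", ".lock", ".txt", ".yml", ".sh", ".md"}


def convert_tree_to_lf(root: pathlib.Path) -> int:
    """Rewrite CRLF line endings to LF under `root`; return the number of files changed.

    Hypothetical variant of `convert_lf` that takes the root directory as an argument.
    """
    converted = 0
    for fp in root.rglob("*"):
        # Assumption: skip directories explicitly, plus files with other extensions.
        if not fp.is_file() or fp.suffix not in EXTENSIONS:
            continue
        content = fp.read_bytes()
        if CRLF in content:
            # Only touch files that actually contain CRLF sequences.
            fp.write_bytes(content.replace(CRLF, LF))
            converted += 1
    return converted


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        (root / "a.py").write_bytes(b"x = 1\r\ny = 2\r\n")  # needs conversion
        (root / "b.md").write_bytes(b"already lf\n")  # left untouched
        (root / "c.bin").write_bytes(b"\r\n")  # extension not in the set
        print(f"{convert_tree_to_lf(root)} files converted to LF")
```

Working on bytes rather than decoded text, as the committed helper does, avoids any dependence on file encoding. Passing the root directory in makes the routine easy to exercise without touching the repository itself.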
10 changes: 8 additions & 2 deletions pyproject.toml
@@ -84,11 +84,17 @@ line-length = 119
 [tool.ruff.format]
 line-ending = "lf"
 
+[tool.coverage.run]
+omit = ["iscc_sct/dev.py", "tests/"]
+
 [tool.poe.tasks]
 format-code = { cmd = "ruff format", help = "Code style formating with ruff" }
 format-markdown = { cmd = "mdformat --wrap 119 --end-of-line lf README.md", help = "Markdown formating with mdformat" }
-test = { cmd = "pytest --cov=iscc_sct --cov-fail-under=100 --cov-report=term-missing --color=yes", help = "Run tests with coverage" }
-all = ["format-code", "format-markdown", "test"]
+convert-lf = { script = "iscc_sct.dev:convert_lf", help = "Convert line endings to LF"}
+test = { cmd = "pytest --cov=iscc_sct --cov-fail-under=100", help = "Run tests with coverage" }
+update-dependencies = { cmd = "poetry update", help = "Update dependencies" }
+all = ["format-code", "format-markdown", "convert-lf", "test"]
+update = ["update-dependencies", "all"]
 
 [build-system]
 requires = ["poetry-core>=1.0.0"]
9 changes: 9 additions & 0 deletions space.yml
@@ -0,0 +1,9 @@
+title: ISCC-LAB - Semantic-Code Text
+emoji: ▶️
+colorFrom: red
+colorTo: blue
+sdk: gradio
+sdk_version: 4.41.0
+pinned: true
+license: CC-BY-NC-SA-4.0
+short_description: Cross Lingual Similarity Preserving Text Simprints
