Merge branch 'main' of https://github.com/BGU-CS-VIL/pdc-dp-means into main
dinarior committed May 29, 2023
2 parents bc826ca + 0b0eb54 commit 58d833a
Showing 3 changed files with 55 additions and 6 deletions.
28 changes: 27 additions & 1 deletion README.md
@@ -9,6 +9,32 @@ number of clusters is unknown. Unlike K-means, however, DP-means is hard to para
### Installation
`pip install pdc-dp-means`

Installation requires `scikit-learn>=1.2,<1.3` and `numpy >= 1.23.0`.
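
The version bounds above can be checked against an existing environment before installing. The following is a minimal sketch using only the standard library; the helper names (`parse`, `in_range`, `check_environment`) are hypothetical and not part of `pdc-dp-means`, and the simplified parser treats a pre-release such as `1.3.0rc1` as `1.3.0`.

```python
from importlib.metadata import PackageNotFoundError, version

# Bounds copied from the installation note: (inclusive lower, exclusive upper).
REQUIREMENTS = {"scikit-learn": ("1.2", "1.3"), "numpy": ("1.23.0", None)}


def parse(v):
    """Turn '1.2.2' into (1, 2, 2), padding with zeros and dropping suffixes like 'rc1'."""
    parts = []
    for piece in v.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def in_range(installed, lower, upper=None):
    """Return True if lower <= installed < upper (upper is optional)."""
    v = parse(installed)
    return v >= parse(lower) and (upper is None or v < parse(upper))


def check_environment():
    """Map each required package to True/False, or None if it is not installed."""
    report = {}
    for name, (lower, upper) in REQUIREMENTS.items():
        try:
            report[name] = in_range(version(name), lower, upper)
        except PackageNotFoundError:
            report[name] = None
    return report
```

Calling `check_environment()` returns a dictionary keyed by package name, so a `False` or `None` entry points at the dependency to upgrade, downgrade, or install.
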
### Quick Start

```python
from sklearn.datasets import make_blobs
from pdc_dp_means import DPMeans

# Generate sample data
X, y_true = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

# Apply DPMeans clustering
dpmeans = DPMeans(n_clusters=1, n_init=10, delta=10)  # delta is the paper's lambda parameter
dpmeans.fit(X)

# Predict the cluster for each data point
y_dpmeans = dpmeans.predict(X)

# Plot the clusters and their centroids
import matplotlib.pyplot as plt

plt.scatter(X[:, 0], X[:, 1], c=y_dpmeans, s=50, cmap='viridis')
centers = dpmeans.cluster_centers_
plt.scatter(centers[:, 0], centers[:, 1], c='black', s=200, alpha=0.5)
plt.show()
```

Note that the `\lambda` parameter from the paper is named `delta` in the code, because `lambda` is a reserved word in Python.
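
To make the role of this parameter concrete, here is a minimal pure-Python sketch of the textbook serial DP-means loop. It is not the parallel PDC algorithm implemented by this package, and the function name `dp_means_sketch` is hypothetical. It assumes `delta` is compared against the squared Euclidean distance to the nearest center, as in the original DP-means formulation; the package's exact convention may differ.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def dp_means_sketch(points, delta, max_iter=100):
    """Serial DP-means sketch; `delta` plays the role of the paper's lambda."""
    centers = [mean(points)]  # start with one cluster at the global mean
    labels = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            dists = [sq_dist(p, c) for c in centers]
            k = min(range(len(centers)), key=dists.__getitem__)
            if dists[k] > delta:
                # Farther than delta from every center: open a new cluster here.
                centers.append(p)
                k = len(centers) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True
        # Recompute centers as member means and drop clusters left empty.
        members = [[p for p, lab in zip(points, labels) if lab == k]
                   for k in range(len(centers))]
        keep = [k for k, m in enumerate(members) if m]
        remap = {old: new for new, old in enumerate(keep)}
        centers = [mean(members[k]) for k in keep]
        labels = [remap[lab] for lab in labels]
        if not changed:
            break
    return centers, labels


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
small_centers, small_labels = dp_means_sketch(points, delta=1.0)
large_centers, large_labels = dp_means_sketch(points, delta=1000.0)
```

With these six points, a small `delta` separates the two groups into their own clusters, while a very large `delta` keeps everything in a single cluster. In this sketch, a larger `delta` therefore means fewer clusters.
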

### Usage
Please refer to the documentation: https://pdc-dp-means.readthedocs.io/en/latest/

@@ -72,4 +98,4 @@ If you use this code for your work, please cite the following:
}
```
### License
Our code is licensed under the BSD-3-Clause license.
4 changes: 2 additions & 2 deletions docs/requirements.txt
@@ -1,3 +1,3 @@
-numpy<=1.3.0
+numpy>=1.3.0
scikit-learn==1.2.2
pdc-dp-means
29 changes: 26 additions & 3 deletions docs/source/index.rst
@@ -1,12 +1,35 @@
Welcome to PDC-DP-Means documentation!
======================================

-**PDC-DP-Means** is a Python library for running fast, scalable DP-Means or Mini-Batch DP-Means. It is built on top scikit-learn and numpy.
+**PDC-DP-Means** is a Python library for running fast, scalable DP-Means or Mini-Batch DP-Means. It is built on top of scikit-learn and numpy.

Check out the :doc:`usage` section for further information, including
-how to :ref:`installation` the project.
+how to :ref:`install <installation>` the project.

Quickstart
----------
.. code-block:: python

    from sklearn.datasets import make_blobs
    from pdc_dp_means import DPMeans

    # Generate sample data
    X, y_true = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

    # Apply DPMeans clustering
    dpmeans = DPMeans(n_clusters=1, n_init=10, delta=10)  # delta is the paper's lambda parameter
    dpmeans.fit(X)

    # Predict the cluster for each data point
    y_dpmeans = dpmeans.predict(X)

    # Plot the clusters and their centroids
    import matplotlib.pyplot as plt

    plt.scatter(X[:, 0], X[:, 1], c=y_dpmeans, s=50, cmap='viridis')
    centers = dpmeans.cluster_centers_
    plt.scatter(centers[:, 0], centers[:, 1], c='black', s=200, alpha=0.5)
    plt.show()

Contents
--------
@@ -29,4 +52,4 @@ If you use this package for your research, please cite the following paper:
author={Dinari, Or and Freifeld, Oren},
booktitle={The 38th Conference on Uncertainty in Artificial Intelligence},
year={2022}
}
