Commit

Update paper.md
I added some copy-editing corrections to the body of the article.
teonbrooks committed Aug 9, 2024
1 parent eb3f97c commit 166f68b
Showing 1 changed file with 12 additions and 12 deletions.
24 changes: 12 additions & 12 deletions article/paper.md
@@ -2,7 +2,7 @@
title: 'Statmanager-kr: A User-friendly Statistical Package for Python in Pandas'
tags:
- Python
-- statistic analysis
+- statistical analysis
- social science
- null-hypothesis
- user-friendly
@@ -21,17 +21,17 @@ bibliography: paper.bib

Python is one of the most popular and easy-to-use programming languages. Despite the fact that many researchers use Python for data preprocessing and statistical analysis, there are few statistical packages that have inherited the simple and easy-to-use nature of Python. As a result, people with statistical skills but no familiarity with programming languages continue to rely on other expensive software. This is because researchers who are not familiar with programming may not know how to use different methods and adjust parameters effectively.

-The `Statmanager-kr` has been designed to provide easy-to-use statistical functions especially for people with little knowledge of programming languages. The `Statmanager-kr` was designed to be compatible with `Pandas.DataFrame`. In addition, the `Statmanager-kr` was designed so that the analysis is performed using only one method with as few parameters as possible.
+`Statmanager-kr` has been designed to provide easy-to-use statistical functions especially for people with little knowledge of programming languages. `Statmanager-kr` was designed to be compatible with `Pandas.DataFrame`. In addition, the `Statmanager-kr` was designed so that the analysis is performed using only one method with as few parameters as possible.

-Additionally, `Statmanager-kr` relies on `Scipy` and `Statsmodels` for valid analysis. The `Statmanager-kr` provides methods for testing the normality and homoscedasticity assumptions, comparing between-group and within-group differences, performing regression analysis, and data visualization.
+Additionally, `Statmanager-kr` relies on `Scipy` and `Statsmodels` for valid analysis. `Statmanager-kr` provides methods for testing the normality and homoscedasticity assumptions, comparing between-group and within-group differences, performing regression analysis, and data visualization.

# Statement of need

-The `statmanager-kr` is a statistical package for Python in `Pandas`. This package provides methods commonly used for null hypothesis significance testing (NHST), which is of interest to researchers in various fields [@Moon2020]. It is also possible to test for normality or equivariance using the Shapiro-Wilk, Levene, or Fmax tests.
+`Statmanager-kr` is a statistical package for Python in `Pandas`. This package provides methods commonly used for null hypothesis significance testing (NHST), which is of interest to researchers in various fields [@Moon2020]. It is also possible to test for normality or equivariance using the Shapiro-Wilk, Levene, or Fmax tests.
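
For readers unfamiliar with the Fmax test named above, the following is a minimal standalone sketch of Hartley's Fmax statistic, which is the ratio of the largest to the smallest group variance. It uses only the Python standard library and does not call `Statmanager-kr`; the function name `hartley_fmax` and the sample data are assumptions made up for illustration.

```python
from statistics import variance


def hartley_fmax(groups):
    """Return Hartley's Fmax: largest sample variance / smallest sample variance.

    `groups` is a list of numeric sequences, each with at least two values.
    This helper is hypothetical and is not part of Statmanager-kr.
    """
    # Sample variance (n - 1 denominator) of each group
    variances = [variance(g) for g in groups]
    smallest = min(variances)
    if smallest == 0:
        raise ValueError("Fmax is undefined when a group has zero variance")
    return max(variances) / smallest


# Made-up scores for three equally sized groups
group_a = [4.0, 5.0, 6.0, 5.0, 4.0]
group_b = [2.0, 6.0, 9.0, 4.0, 7.0]
group_c = [5.0, 5.5, 4.5, 5.0, 6.0]

fmax = hartley_fmax([group_a, group_b, group_c])
print(f"Fmax = {fmax:.2f}")
```

A value close to 1 suggests similar variances across groups. In practice the statistic is compared against a critical value that depends on the number of groups and the degrees of freedom per group.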

Most of the statistical software available today is difficult to use, as a previous study reported that one of the challenges students face in statistics courses was "using software" [@Murtonen:2003]. Although there are basic statistical libraries in Python, such as Scipy [@seabold:2010] and Statsmodels [@Virtanen:2020], they are quite complex. While some studies require complex and detailed statistical modeling and analysis, there are also many studies that require only a few hypothesis tests. Therefore, the development of an easy-to-use statistical package would be of great benefit to these researchers.

-To achieve this, the `statmanager-kr` has been designed to run analyses with only three lines of code: 1. read data as a `Pandas.DataFrame`, 2. create a `Stat_Manager` object, 3. execute the `.progress()` method. Therefore, users can use the `Statmanager-kr` as long as they know the Pandas methods to read the data, such as `.read_csv()` or `.read_excel()`. It also includes functions to visualize the results depending on the analysis method.
+To achieve this, `Statmanager-kr` has been designed to run analyses with only three lines of code: 1. read data as a `Pandas.DataFrame`, 2. create a `Stat_Manager` object, 3. execute the `.progress()` method. Therefore, users can use `Statmanager-kr` as long as they know `Pandas` methods to read the data, such as `.read_csv()` or `.read_excel()`. It also includes functions to visualize the results depending on the analysis method.


# Related Work
@@ -40,13 +40,13 @@ Recent advances in the field of statistics have been achieved through the emerge

However, `Statmanager-kr` and `Pingouin` differ in their target users. Since `Statmanager-kr` is designed for researchers with limited programming experience, it focuses on keeping the workflow short and concise; therefore, `Statmanager-kr` was designed to allow users to apply analyses and obtain results by always running a single method, `.progress()`, in a similar way. On the other hand, `Pingouin` was developed for users with a relatively high level of programming knowledge and experience; therefore, in terms of workflow, `Pingouin` offers more comprehensive and fine-tunable analysis methods and provides more detailed analysis results. Also, `Statmanager-kr` only works with `Pandas.DataFrame`, while `Pingouin` has the advantage of being compatible with a wider range of datasets.

-Another difference is related to visualization and post-hoc. The `statmanager-kr` performs post-hoc by adding the parameter `posthoc` to the `.progress()`. In addition, it is possible to visualize the results by using `.figure()` as a method chaining. Although `Pingouin` does not provide the ability to directly visualize the results of an analysis, it does support the generation of graphs that are very useful from a statistical perspective, such as paired plots, shift plots, and circle mean plots. In addition, Pingouin has the advantage of supporting a wider range of post-hoc tests.
+Another difference is related to visualization and post-hoc. `Statmanager-kr` performs post-hoc by adding the parameter `posthoc` to the `.progress()`. In addition, it is possible to visualize the results by using `.figure()` as a method chaining. Although `Pingouin` does not provide the ability to directly visualize the results of an analysis, it does support the generation of graphs that are very useful from a statistical perspective, such as paired plots, shift plots, and circular mean plots. In addition, Pingouin has the advantage of supporting a wider range of post-hoc tests.
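
The method-chaining pattern mentioned above (an analysis call followed directly by `.figure()`) can be illustrated with a toy sketch. Everything here is hypothetical: `ToyManager` and `ChainableResult` are invented names, the "analysis" is just a mean, and this is not how `Statmanager-kr` is implemented internally. It only shows why returning a result object from `.progress()` lets a second call be chained onto it.

```python
class ChainableResult:
    """Hypothetical result object; not part of Statmanager-kr."""

    def __init__(self, summary):
        self.summary = summary
        self.plotted = False

    def figure(self):
        # A real implementation would draw a plot here; this sketch only sets a flag.
        self.plotted = True
        return self  # returning self keeps the chain open for further calls


class ToyManager:
    """Hypothetical stand-in for an analysis manager wrapping tabular data."""

    def __init__(self, data):
        self.data = data  # plain dict of column name -> list of numbers

    def progress(self, method, vars):
        values = self.data[vars]
        summary = {"method": method, "n": len(values), "mean": sum(values) / len(values)}
        # Returning an object (rather than None) is what makes chaining possible.
        return ChainableResult(summary)


# Made-up data; the call shape mirrors an analysis followed by a chained figure call
result = ToyManager({"weight": [60.0, 70.0, 80.0]}).progress(method="mean", vars="weight").figure()
print(result.summary)
```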

In conclusion, depending on the researcher's programming experience and the purpose of the study, `Statmanager-kr` and `Pingouin` can be used differently. Researchers who are familiar with programming may be better suited to use `Pingouin` as it supports more analysis methods and customization. On the other hand, `Statmanager-kr` is designed to be used by researchers who are not familiar with programming and coding, but want to get quick results.

# Features

-The `Statmanager-kr` was designed to be compatible with the wide range form of `Pandas.DataFrame`.
+`Statmanager-kr` was designed to be compatible with the wide range form of `Pandas.DataFrame`.

## User-friendly Features

@@ -65,7 +65,7 @@ Users can search for a specific usage by calling the `.howtouse()` method. It ca

## Statistical Test

-The implementation of analysis in statmanager-kr can be summarized as follows.
+The implementation of analysis in `Statmanager-kr` can be summarized as follows.

| Objective | Analysis |
| ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
@@ -84,10 +84,10 @@ The implementation of analysis in statmanager-kr can be summarized as follows.

```python
import pandas as pd
-from stamanager import Stat_Manager
+from statmanager import Stat_Manager

# 1. Reading the data
-df = pd.read_csv(r'../testdata.csv', index_col = 'name')
+df = pd.read_csv(r'testdata.csv', index_col = 'name')

# 2. Creating object of Stat_Manager class
sm = Stat_Manager(df)
@@ -139,7 +139,7 @@ The method-specific information needed to use the `.progress()` method can be fo
| `rm_ancova` | One-way Repeated Measures ANCOVA | `vars` |
| `cronbach` | Calculating Cronbach's Alpha | `vars` |

-Also `Statmanager-kr` provides two posthoc methods. It can be run by providing the key of the `posthoc_method` parameter as follows:
+Also `Statmanager-kr` provides two post-hoc methods. It can be run by providing the key of the `posthoc_method` parameter as follows:

| Key of `posthoc_method` | Method |
| ----------------------- | --------------------- |
@@ -160,4 +160,4 @@ sm.progress(method = 'ttest_ind', vars = 'weight', group_vars = 'sex').figure()

Author declares no conflicts of interests.

-# References
+# References
