Fine-Tuning a Vision Transformer (Swin-Tiny) for Detection and Classification of AI-generated Images

The notebooks in this repository fine-tune a pre-trained vision transformer (Swin-Tiny) to extend a binary classification baseline: identifying whether an image was created by generative AI. The work here expands that baseline into a multiclass classification problem: identifying whether an image is authentic (human-generated) or was produced by one of several text-to-image AI generators (namely Stable Diffusion, Midjourney, and DALL-E).

The goal was to tackle the multiclass classification problem using three separate approaches to transfer learning:

The first experiment used the model as a feature extractor. The extracted outputs were passed to a logistic regression classifier implemented in Scikit-learn (LogisticRegressionCV) to classify the images.
The second experiment was fine-tuning with frozen layers. We froze all of the pre-trained parameters up to the final linear layer, then added our own trainable linear layer that transformed the output dimensions and fed a softmax for classification.
The third experiment was selective fine-tuning, a natural extension of experiment 2: we froze every layer except the last block (specifically Stage 3, Block 1), which remained trainable. As in the previous experiment, we added a trainable linear layer with a softmax for classification.
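
The three approaches can be sketched as follows. This is a minimal illustration under stated assumptions, not the notebooks' actual code: `TinyBackbone` is a hypothetical stand-in for the pre-trained Swin-Tiny backbone (so the sketch runs without downloading weights), its `last_block` plays the role of Stage 3, Block 1, the images and labels are synthetic, and the helper names `extract_features`, `build_classifier`, and `trainable_names` are invented for illustration.

```python
import torch
from torch import nn
from sklearn.linear_model import LogisticRegressionCV

NUM_CLASSES = 4   # authentic, Stable Diffusion, Midjourney, DALL-E
FEATURE_DIM = 32  # assumption for this toy backbone, not Swin-Tiny's real size


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for a pre-trained backbone such as Swin-Tiny."""

    def __init__(self):
        super().__init__()
        self.early = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 64), nn.GELU())
        # Plays the role of the final block (Stage 3, Block 1 in the README).
        self.last_block = nn.Sequential(nn.Linear(64, FEATURE_DIM), nn.GELU())

    def forward(self, x):
        return self.last_block(self.early(x))


def extract_features(backbone, images):
    """Experiment 1: run the backbone without gradients and return NumPy features."""
    backbone.eval()
    with torch.no_grad():
        return backbone(images).numpy()


def build_classifier(backbone, unfreeze_last_block=False):
    """Experiments 2 and 3: freeze the backbone, optionally unfreeze its last block,
    and attach a new trainable linear head."""
    for param in backbone.parameters():
        param.requires_grad = False
    if unfreeze_last_block:
        for param in backbone.last_block.parameters():
            param.requires_grad = True
    head = nn.Linear(FEATURE_DIM, NUM_CLASSES)  # new layers are trainable by default
    return nn.Sequential(backbone, head)


def trainable_names(model):
    """List the parameters an optimizer would actually update."""
    return [name for name, p in model.named_parameters() if p.requires_grad]


torch.manual_seed(0)
images = torch.randn(40, 3, 8, 8)         # synthetic stand-in images
labels = torch.arange(40) % NUM_CLASSES   # synthetic labels, 10 per class

# Experiment 1: feature extraction followed by cross-validated logistic regression.
features = extract_features(TinyBackbone(), images)
clf = LogisticRegressionCV(cv=2, max_iter=1000).fit(features, labels.numpy())

# Experiment 2: everything frozen except the new linear head.
frozen_model = build_classifier(TinyBackbone())

# Experiment 3: the last block is also left trainable.
partial_model = build_classifier(TinyBackbone(), unfreeze_last_block=True)

# One training step for experiment 3. CrossEntropyLoss applies log-softmax
# internally, so the model itself outputs raw logits.
optimizer = torch.optim.AdamW(
    [p for p in partial_model.parameters() if p.requires_grad], lr=1e-4
)
loss = nn.CrossEntropyLoss()(partial_model(images), labels)
loss.backward()
optimizer.step()
```

Calling `trainable_names` on each model is a quick way to confirm that only the intended parameters will be updated before starting a long training run.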

Read the full report here.

