Juneeta Lakshmi Vangala – Applied Science
Abstract
The impact of Artificial Intelligence (AI) on the art industry has grown rapidly in recent years. This project aims to differentiate between AI-generated and human-made art. It uses Convolutional Neural Networks (CNNs) to classify images, using a dataset customized from Kaggle. Through model training, the CNN demonstrates promising performance in distinguishing between AI-generated and real art, with increasing accuracy and decreasing loss over training. However, the study acknowledges several limitations, including the dataset’s limited size and computational resource constraints, which may affect the model’s generalization and accuracy. Future research should expand the dataset to encompass a broader range of art styles and leverage more powerful computational resources. Despite its limitations, this project showcases the importance of developing technological solutions that support human creativity rather than undermine artists’ income.
Introduction
Artificial Intelligence (AI) has been a part of the art scene since the 1960s (Offert, 2021). Early examples of AI art included computer-generated shapes and patterns. Artists such as Luke DuBois, Sam Lavigne, Sven König, Parag Kumar Mital, and Kyle McDonald used AI technologies to create generative and interactive works (Grba, 2022).
Figure 1: “Missed Connections” – Luke DuBois (2012) The artist created an algorithm that sifts through the “Missed Connections” section of Craigslist for nine different cities including New York, San Francisco, and London.
In recent years, AI’s impact on the art industry has increased exponentially. The special issue of Artnodes Journal, “AI, Arts & Design: Questioning Learning Machines”, addresses the issues of authorship and creative agency (West & Burbano, 2020). It examines the ability of AI and machine learning to create art infused with emotion, highlighting the importance of distinguishing between art “inspired by AI” and “created by AI.”
Figure 2: The number of AI-related patents filed in 2021 was 30 times higher than in 2015, a compound annual growth rate of 76.9%. (Source: 2022 AI Index Report)
With the rise of AI image generators such as DALL-E, Midjourney, IMG2GO, and Stability AI’s tools, as well as GAN-based systems, the creation of AI art has surged. This influx of AI-generated images makes AI image detectors pivotal.
Figure 3: Search interest in the term “AI art” has increased over the past nine years (Source: Google Trends)
Most artists across the world make very little money. Surveys in the US and UK showed that over 75% of artists made less than USD 10,000 per year, while about 50% of artists made no more than USD 5,000 per year. Asian, African American, Hispanic, Indigenous, and female artists fared even worse (The Artfinder Independent Art Market Report, 2017).
Figure 4: Annual net income of artists (Source: The Artfinder independent Art Market Report, 2017)
Marginalized communities, peoples or populations are groups and communities that experience discrimination and exclusion (social, political, and economic) because of unequal power relationships across economic, political, social, and cultural dimensions (National Collaborating Centre for Determinants of Health). Traditional artists from marginalized communities are struggling to make a living due to the overshadowing presence of AI-generated artwork and limited resources (Placido, 2024).
Technology has the power to either enable or impede an artist. While it helps artists create, collaborate, and commercialize their work easily with a global audience, it can also undermine their work through digital piracy, copyright infringement, and AI-generated art, leading to detrimental consequences for the artist’s financial and mental health. AI can not only demoralize the artist but also negatively impact the entire art ecosystem of academia, distributors, collectors, funders, patrons, and the public.
Figure 5: AI-generated Indigenous art (Generated by img2go.com)
Figure 6: The Artist With Old Friends by Norval Morrisseau (McMichael Canadian Art Collection)
This project aims to distinguish between authentic artwork created by humans and artwork generated by AI, thereby empowering artists to commercialize their work at a higher value and protect themselves from cheap AI-generated art. This contributes to building a sustainable ecosystem by providing reliable verification for art enthusiasts to invest in and support the traditional art industry. The model was trained on Indigenous artwork because the overshadowing presence of AI artwork presents a foreboding picture for traditional artists from marginalized communities. These artists are underrepresented in mainstream art, lack the resources to protect their work, and have limited opportunity to reach new audiences and expand their artistic horizons (Placido, 2024).
Because the tool does not generate art, training it on preexisting artwork does not add to the pool of AI art; it only helps detect and classify art better.
Materials and Methods
Figure 7: Flowchart of the process
This project develops a Convolutional Neural Network (CNN) to classify images into the categories “AI art” and “Real art.” The datasets were obtained from Kaggle, a resource hub for data science, and customized to meet project requirements. The images were validated to exclude unsupported formats (.TIFF, .GIF) and those smaller than 10 KB to ensure quality; a sketch of this filtering step follows the library list below. The project was implemented in Python, utilizing the following key libraries:
- TensorFlow and Keras: For building and training neural networks.
- OpenCV: For image processing.
- Matplotlib: For data visualization.
- Pandas: For data manipulation and preprocessing.
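As a rough illustration of the validation step described above, the sketch below walks the dataset directory and removes files that are too small or in an unsupported format. Only the 10 KB threshold and the excluded formats come from the description; the directory layout (one subfolder per class) and the class folder names are assumptions for illustration.

```python
import os
import cv2

data_dir = 'data'  # assumed layout: data/ai_art/ and data/real_art/ (names hypothetical)
allowed_exts = {'.jpg', '.jpeg', '.png', '.bmp'}  # excludes .tiff and .gif

for image_class in os.listdir(data_dir):
    class_dir = os.path.join(data_dir, image_class)
    for image_name in os.listdir(class_dir):
        image_path = os.path.join(class_dir, image_name)
        ext = os.path.splitext(image_name)[1].lower()
        # Drop unsupported formats and files under 10 KB
        if ext not in allowed_exts or os.path.getsize(image_path) < 10 * 1024:
            os.remove(image_path)
            continue
        # Drop files that OpenCV cannot decode
        if cv2.imread(image_path) is None:
            os.remove(image_path)
```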
The images were normalized from the range [0, 255] to [0, 1] by dividing pixel values by 255; scaling inputs to [0, 1] is critical for training deep learning models efficiently.
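A minimal sketch of the loading and normalization step, assuming the Keras directory-loading utility was used; the directory name and image size are illustrative rather than taken from the original code.

```python
import tensorflow as tf

# Build a labeled, batched dataset from the class subfolders;
# labels are inferred from the folder names.
data = tf.keras.utils.image_dataset_from_directory('data', image_size=(256, 256))

# Scale pixel values from [0, 255] to [0, 1]
data = data.map(lambda x, y: (x / 255.0, y))
```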
Python was used in a Jupyter Notebook for implementation. The classification is done using an end-to-end pipeline, a processing method that automates machine learning workflows by processing and integrating datasets into a model (Lev, 2023).
The Convolutional Neural Network (CNN) used in this project is a specialized type of deep learning algorithm designed for object recognition: image classification, detection, and segmentation. It autonomously extracts features from data and identifies patterns regardless of position, scale, and orientation. Architectural layers are stacked together, with early layers extracting simple features that deeper layers combine into more complex patterns. ReLU activation functions were used in the convolutional layers to introduce non-linearity.
The model architecture included a sequential stack of Conv2D, MaxPooling2D, Flatten, and Dense layers.
Figure 8: Code for the architectural layers
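Since Figure 8 is not reproduced here, the following is a representative reconstruction of such a stack, assuming only the layer types and ReLU activations named above; the filter counts, kernel sizes, and input shape are illustrative assumptions, not values from the original code.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    # Convolutional blocks: extract local features, then downsample
    Conv2D(16, (3, 3), activation='relu', input_shape=(256, 256, 3)),
    MaxPooling2D(),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(),
    Conv2D(16, (3, 3), activation='relu'),
    MaxPooling2D(),
    # Flatten the feature maps and classify
    Flatten(),
    Dense(256, activation='relu'),
    Dense(1, activation='sigmoid'),  # one probability: AI art vs. real art
])
```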
The model was compiled with a binary cross-entropy loss function and optimized using the Adam optimizer. Binary cross-entropy is a loss function commonly used for binary classification tasks. It measures the performance of a classification model whose output is a probability value between 0 and 1; during training, the model’s goal is to minimize this loss.
Figure 9: Code for the compilation of code
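For reference, binary cross-entropy averages -[y·log(ŷ) + (1 - y)·log(1 - ŷ)] over the training examples, where y is the true label (0 or 1) and ŷ is the predicted probability. A minimal compilation sketch consistent with the description above (continuing from the architecture sketch):

```python
import tensorflow as tf

model.compile(
    optimizer='adam',                            # Adam optimizer, default settings
    loss=tf.keras.losses.BinaryCrossentropy(),   # binary cross-entropy loss
    metrics=['accuracy'],                        # track accuracy during training
)
```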
The dataset is then split into three sections: training, validation, and testing. 70% of the dataset is used to train the model, 20% is used for validation, and 10% is used for testing. The validation set contains examples that the model hasn’t seen during training, serving as an independent benchmark for evaluating its generalization ability and helping to generate an unbiased estimate of the model’s accuracy (Brownlee, 2020).
Figure 10: Code for splitting data
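One way to realize the 70/20/10 split on a batched TensorFlow dataset is with take/skip, as in the sketch below; the `data` variable is assumed to carry over from the loading snippet, and the split is done per batch.

```python
# Split batches 70/20/10 into training, validation, and test sets
n_batches = len(data)
train_size = int(n_batches * 0.7)
val_size = int(n_batches * 0.2)
test_size = n_batches - train_size - val_size

train = data.take(train_size)
val = data.skip(train_size).take(val_size)
test = data.skip(train_size + val_size).take(test_size)
```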
A log directory was created to record training events and potential issues for debugging. The model was trained using the following settings:
Figure 11: Code for training data
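A hedged sketch of this training step, assuming a TensorBoard callback writes events to the log directory mentioned above; the epoch count and log directory name are illustrative, since the exact settings appear only in the figure.

```python
import tensorflow as tf

# Record training events to a log directory for debugging and inspection
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir='logs')

# Train the model, evaluating on the validation set after each epoch
hist = model.fit(train, epochs=20, validation_data=val,
                 callbacks=[tensorboard_callback])
```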
Results
The model’s performance was plotted using Matplotlib. The two plots used for this project are the accuracy plot and the loss plot. Loss and accuracy graphs are vital tools in machine learning and deep learning for visualizing a model’s performance during the training and validation phases.
Figure 12: Loss plot of the model
The loss graph shows how the model’s error decreases over time. The error, or loss, is the difference between the model’s predictions and the actual target values; a lower loss value indicates a better model. The X-axis is the number of epochs over which the model is trained, where an epoch is one complete pass through the training dataset. The Y-axis represents the loss value. Neither axis has physical units.
Figure 13: Accuracy plot of the model
The accuracy graph measures how often the model’s predictions match the target labels; it typically increases over time as the model learns. The X-axis again shows the number of epochs. The Y-axis has no physical unit: accuracy is the ratio of correct predictions to the total number of predictions made, often multiplied by 100 and expressed as a percentage (%).
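Both plots can be produced from the history object returned by model.fit. A sketch, assuming the `hist` variable from the training snippet and Keras’ default metric key names:

```python
import matplotlib.pyplot as plt

# Loss plot: training vs. validation loss per epoch
plt.plot(hist.history['loss'], label='loss')
plt.plot(hist.history['val_loss'], label='val_loss')
plt.title('Loss')
plt.legend()
plt.show()

# Accuracy plot: training vs. validation accuracy per epoch
plt.plot(hist.history['accuracy'], label='accuracy')
plt.plot(hist.history['val_accuracy'], label='val_accuracy')
plt.title('Accuracy')
plt.legend()
plt.show()
```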
Discussion
The model developed in this study distinguishes between AI-generated art and human-created art, focusing on Indigenous artwork. Its performance, with increasing accuracy and decreasing loss over training, suggests that a Convolutional Neural Network (CNN) is well suited to this task.
These findings are expected to benefit the art industry, especially traditional artists from marginalized communities. By providing a reliable method to authenticate human-made art, this model can empower artists to commercialize their work at a higher value and protect themselves from cheap AI-generated art. This technology could serve as a tool for art distributors and collectors aiming to support and invest in authentic, human-made art, helping to preserve traditional art.
Despite the promising results, there are several limitations to this study. First, the dataset was obtained from a specific source (Kaggle) and customized for this project, which may not fully represent the diversity of AI-generated and human-made art. Second, the model was trained and validated on images that were pre-processed to meet specific criteria (size, format, and normalization), which might not encompass the variability found in real-world scenarios. Additionally, the training data is limited: 186 AI-generated images and 180 Indigenous artworks were used. Training a CNN on only a few hundred images is generally ineffective; effective generalization typically requires a dataset of over 5,000 samples (Zhou, 2019).
Another crucial limitation is the computational resources available for this project. Due to limited GPU power, only a small subset of images was used when generating the results, so the reported accuracy may appear higher than it actually is. Future research should leverage more powerful computational resources to fully utilize the available data and improve model accuracy.
To build on the findings of this study, future research should focus on expanding the dataset to include a wider variety of art styles and sources, and on securing a more powerful GPU. Future work should also explore alternative machine learning algorithms; integrating different approaches could increase model accuracy.
This study has demonstrated that a CNN-based model can effectively differentiate between AI-generated and human-created art, particularly Indigenous artwork. These results contribute to our understanding of how technology can be leveraged to support traditional artists and protect their work from being overshadowed by AI-generated content. However, further research is necessary to enhance the model’s accuracy and applicability in diverse, real-world contexts. This project showcases the importance of developing technological solutions that support human creativity rather than undermine artists’ income.
References
Aw, B. (2024, March 20). How does AI detection work: You will definitely be shocked. Brendan Aw. https://brendanaw.com/how-does-ai-detection-work
Brownlee, J. (2020, August 14). What is the difference between test and validation datasets? MachineLearningMastery.com. https://machinelearningmastery.com/difference-test-validation-datasets/
Chai, L., Bau, D., Lim, S.-N., & Isola, P. (2020). What makes fake images detectable? Understanding properties that generalize. Computer Vision – ECCV 2020, 103–120. https://doi.org/10.1007/978-3-030-58574-7_7
Grba, D. (2022). Deep else: A critical framework for AI art. Digital, 2(1), 1–32. https://doi.org/10.3390/digital2010001
Filimowicz, M. (2024, January 20). The history and evolution of AI-generated art. Medium. https://medium.com/higher-neurons/the-history-and-evolution-of-ai-generated-art-e5ccca5a8e83
Placido, D. D. (2024, January 17). The problem with AI-generated art, explained. Forbes. https://www.forbes.com/sites/danidiplacido/2023/12/30/ai-generated-art-was-a-mistake-and-heres-why/?sh=42f955153fef
Syed, A. H. (2022, December 27). Logging in machine learning. Medium. https://syedabis98.medium.com/logging-in-machine-learning-cf69df7ccf3c
West, R., & Burbano, A. (2020). Editorial. AI, Arts & Design: Questioning learning machines. Artnodes, (26). https://doi.org/10.7238/a.v0i26.3390
Zhou, V. (2019, June 6). Training a convolutional neural network from scratch. Medium. https://towardsdatascience.com/training-a-convolutional-neural-network-from-scratch-2235c2a25754