Building AI keywording tool for microstocks yourself in under 10 minutes

updated on August 31, 2024 / by Taras Kushnir

Feature image

In the previous blogpost about AI keywording services, there was one service, which we did not discuss. An elephant in the room, one might say. Using AI (ChatGPT/Claude) directly. Or, even better, building a whole “AI keywording service” yourself.

It took me way more time to write this blogpost rather than build the thing. It was that fast! The result works with multiple images at the same time and exports a CSV file that you can import to microstocks. Voilà!

Table of Contents

TL;DR;

If you’re here just to see the tool, click here. Note, that I made small style changes for it to fit into Xpiks website. Everybody else, please read on.

Building the thing

Disclaimer about time

It took me less than 10 minutes because I happened to have some experience in software engineering. Claude made a few mistakes and while it has fixed them fully on its own eventually, sailing wouldn’t be as smooth if I was a complete newbie.

Another disclaimer is that I had an OpenAI API key already so if you’re new to this, it might take a couple of minutes more just to register.

On being lazy

If you’re using ChatGPT, it gives you the source code of whatever you’re building, but it’s not interactive. To test it, you need to copy the code back and forth to where you can try it. This is something I did not want to bother with.

However, there was an exciting development recently, called Claude Artifacts. When you generate a piece of software, Claude (the company behind models such as Claude Sonnet 3.5) can host the code for you in some form that allows its execution. At least that is valid for simple web pages.

That was enough of a reason to use Claude for this task, even though I did not have a paid account with them. Everything below was created using a free Claude account.

Prompting and testing

I signed up with Claude, confirmed my email and I was ready to start prompting. Here’s the first prompt I used:

You’re an expert web programmer. I would like you to create a web page as an artifact that allows to select a local image and this image will be sent to ChatGPT API. The API will return me a Title, Description and Keywords (comma-separated list) that will be good for microstock photography websites. Description should be below 200 characters and Title should be below 70. Website artifact that you will create should allow to customize the prompt sent to OpenAI and also to set my own API Key

Claude prompt and app
My first failed prompt, but the app (on the right) looks decent!

Yes, I realize it’s cruel to make Claude AI create a page to use ChatGPT (its arch-competitor) but I did not have Claude API Key and I did have an OpenAI API Key so this was a necessary sacrifice to make this work.

Claude made 2 mistakes:

It used the now defunct gpt-4-vision-preview model
It messed something up with the image encoding in a request to OpenAI.

I did not want to read the code, so just asked it to “fix it”, literally:

Fixing Claude error
Just ask it to fix it!

After this, the webpage was fully functional, and, although it has an obvious problem with the right margin and text overflow, it does work!

Result of ChatGPT
Crude, but working!

There were two more important things missing:

I wanted to be able to upload many images at once
I needed a CSV file in the end, not a list of text with metadata

So this is pretty much what I asked it to do, one thing at a time.

Generating for multiple files

While I was testing this, I figured that I also want to be able to stop execution, if I realize a mistake in the prompt. So additionally I asked Claude to add a “Stop” button. It complied.

The final page looked a bit off in terms of styles, but it was tiresome to ask to fix the small things as Claude was regenerating the whole thing again and again. So I downloaded the file and made a couple of edits myself:

fixed webpage style to make those margins behave
added a spinner to spin while metadata is generating
disabled buttons after clicking on them

Final result
Final.final.html

Limitations

There are a couple of things to keep in mind:

whole files are always sent to OpenAI, so it’s not the most efficient way (especially, if you have many files)
files are sent one by one so it can take more time to get metadata (OpenAI has API rate limits anyway so sending everything in parallel is also not possible)
only photos are supported at the moment, videos and vectors are not
metadata is not written back to files, only a CSV file is generated. So it’s best to use software like Xpiks to import it back
there’s a limited number of keywords generated and they are not the best, because LLMs actually don’t produce good keywords (Xpiks uses other machine learning models for that)

Integrating into this website

While this separate page was working perfectly, to make anybody use it, I wanted to integrate it into this website. So I had to actually work manually a bit more to align the styles of buttons and text fields.

Integrated into Xpiks website

Except for the FAQ items that I added later, it’s the original tool, created by Claude. It may look bare-bones, but it has everything to get you started!

Try it out!

Evaluating

Metadata

If you’ve read the previous blogpost, you know that I was evaluating AI keywording services on my best-seller picture:

Mountaineers
Image that was used for tests

So obviously, for comparison, we also need to use our shiny new tool by Claude. Here’s the metadata it generated:

Field	ChatGPT + Claude	The Original
Title	Climbers Conquering a Snow-Capped Mountain Ridge	Tied climbers climbing mountain with snow field tied with a rope with ice axes and helmets
Description	A group of climbers traverses a snow-covered ridge against a clear blue sky. They are roped together, showcasing teamwork in a challenging alpine environment.	Tied climbers climbing mountain with snow field tied with a rope with ice axes and helmets

As for the actual keywords, here’s the list: mountaineering, snow, climbers, teamwork, alpine, adventure, outdoor sports, rugged terrain, winter, hiking, summit, peak, rope, landscape, cold weather. Not many “money keywords” included, but this is a super rough sketch and with some prompt engineering you can get better results.

No fuss, but by tweaking the prompt, I’m sure it will become on par with other services (maybe sans the keywords).

Price

The biggest advantage here is that having an API Key is way cheaper than subscribing to ChatGPT Pro. OpenAI has a couple of options that affect pricing, such as resolution of the patches that are used to “survey” an image, but the resulting price should be around 3-5 times cheaper than any solution mentioned in the previous blogpost (especially Pixify). The reason to pay more is, of course, the convenience.

What’s next

Here’s the actual code that Claude has generated, with a few cosmetic improvements. You can just download the file to your computer and open it in any browser. It can be taken quite far by asking for more edits and improvements from Claude and in one evening you can have your own bare-bones PhotoTag or VisualMind clone.

Although it works just fine, some corners were cut (see Limitations). If you wish to use a more professional tool, check out auto-keywording plugin for Xpiks.