DIY: Image Recognition
You have a farm, and you need to count the number of cows in the field to make sure none are missing. A machine can do this for you:
Real-life examples: counting customers who entered a store today, sorting fruit, or counting cars in a parking lot.
We will create a simple image analyzer to count the number of cows in a field and automatically highlight them in a photo. No programming experience required. If you don’t understand something, just ask ChatGPT.
The code is universal, so you can detect people or other objects just by editing a single line of code. We’ll use YOLOv8, a cutting-edge, highly accurate image recognition library that’s free to use.
Requirements:
- Computer (Mac or Windows or Linux)
- Internet
- Python (we will install it)
- AI model YOLOv8 (we will download it)
- Image of cows (we will download it)
Set Up the Workspace
The coding process involves two parts: writing code and executing it. Typically, we write code in a text editor, and we execute it using a tool like “Terminal” on a Mac. However, there are applications specifically designed for coding, like Visual Studio Code, which lets you write and execute code on the same page.
Visual Studio Code will also help you install Python. You can also install extensions that make coding faster (such as Python autocomplete and Copilot) or an extension that stores your code online (like a Git repository).
Install Python
Python is a programming language. We’ll use it to write code and install the AI models. To run the Python code, we’ll use Visual Studio Code. To add Python support to Visual Studio Code, open “Extensions” on the left bar, type “Python” in the search field, and install the Python extension. The extension does not include Python itself, so also install Python by following the steps below for your system.
Install Python on Mac
During installation, you may encounter errors. Ask ChatGPT for help resolving them (e.g., copy and paste the error message).
To install Python on a Mac, you need to download Homebrew, which is like an app store for programmers. Open the “Terminal” application (press Command+Space and type “Terminal”), then copy and paste the code below into the “Terminal”:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Press Enter and follow the on-screen instructions.
After installing Homebrew, it’s recommended to run
brew update
This ensures you have the latest version of Homebrew and that you’ll download the most recent version of Python.
Next, download Python by typing
brew install python
Follow the instructions and ask ChatGPT if you encounter any errors.
Install on Windows
Visit the official Python download page and select the latest Python installer for Windows.
Installation is straightforward, but you might encounter some problems. If you do, explain the errors to ChatGPT (e.g., copy and paste the error message).
Write Code
Create a new document by pressing Command+N (Ctrl+N on Windows). Next, save the file using Command+S (Ctrl+S on Windows) as “analyze_image.py”, where “analyze_image” is the name, and “.py” indicates a Python file.
Now, write a simple program that prints “Hi!”:
print("Hi!")
Save the code by pressing Command+S.
Run Code
To run the code, click the play button. You will see the output in the “Terminal” window at the bottom.
(Note: your terminal may look different from mine, but it doesn’t make any difference)
You may also press the F5 button. The first time you use it, you will be asked which debugger (same as running the code) to use. Select the first one.
Alternatively, you can run a program using the terminal in Visual Studio Code or the “Terminal” on Mac. This is slightly more challenging. First, you should open the folder where the Python file is located. This is displayed at the top of Visual Studio Code.
To open it, type the “cd” command (change directory) followed by the path in the command line. Don’t include the name of the file; stop at the last folder.
cd /Users/daniil_kovekh/Desktop/bot/Image/find_cow
Now, you can list all the files in the folder.
ls
To run the Python code, type “python” and the name of the file with “.py” at the end. On some systems, including many Macs, the command is “python3” instead.
python analyze_image.py
That’s it. If you encounter an error, ask ChatGPT for help. There may be a mistake in the code.
Download Libraries
Basics
A library is code written by other developers. Here’s how to use one:
For example, if you want to create a graph showing the number of customers who visited your store last week, you could use an existing graphing library like “Matplotlib.” This library can draw almost any graph you need.
First, download the library from the internet. Use the command line in Visual Studio Code for this (you can also use the “Terminal” application on Mac). We’ll use the Python installer, called “pip.”
pip install matplotlib
Press “Enter” on your keyboard.
After it’s downloaded, simply write this code.
The “#” symbol is used for comments. The text after it on the line is visible to you, but ignored by the machine.
# Import the library and the module pyplot. The module contains functions.
import matplotlib.pyplot
# Define the data. We store data in lists. Here, each list holds elements of the same type.
# days is a list of text values - the "string" type
days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
# customers is a list containing numbers - "integer" format
customers = [120, 140, 110, 100, 170, 200, 210]
# Add the data to the graph (X, Y).
# We use a bar graph - the function is also named "bar".
matplotlib.pyplot.bar(days, customers)
# Draw and display the graph.
matplotlib.pyplot.show()
Save the code with the Command+S button, and click the play button to run the code (or use other methods that we learned previously). The graph appears in another window, or maybe on another desktop.
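Because days and customers are ordinary Python lists, you can also work with the numbers directly before drawing anything. Here is a minimal sketch that reuses the two lists from above to find the busiest day and the weekly total. The names pairs, busiest_day, busiest_count, and total are only illustrative.

```python
days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
customers = [120, 140, 110, 100, 170, 200, 210]

# zip() pairs each day with its customer count: ('Mon', 120), ('Tue', 140), ...
pairs = list(zip(days, customers))

# max() with a key picks the pair with the largest customer count.
busiest_day, busiest_count = max(pairs, key=lambda pair: pair[1])

# sum() adds up every number in the list.
total = sum(customers)

print(f"Busiest day: {busiest_day} ({busiest_count} customers)")
print(f"Total for the week: {total}")
```

This needs no extra library, so it runs even before Matplotlib is installed.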
Free or Paid
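Most libraries are free, but some of the services they connect to require payment. For example, Yahoo Finance offers free access to stock price data. However, OpenAI, the provider of ChatGPT, charges for its programming interface: at the time of writing, about $0.06 for every 1,000 tokens (roughly 750 words) generated by GPT-4.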
YOLOv8, the image analyzer, is free to use.
Internet Access
Some libraries require internet access. Yahoo Finance and OpenAI both work with the internet. Yahoo Finance uses the internet to access its database of stock prices. OpenAI uses the internet for another reason: it receives the text you wrote, runs the AI program on its server, and sends you the answer.
YOLOv8 works both online and offline. It needs the internet the first time you use a pre-trained model, because it downloads the model file (such as “yolov8n.pt”) that identifies what’s in a picture. After that, the model runs on your computer. A model you train yourself, by collecting hundreds of photos and labeling what is shown in them, also works offline.
Import YOLOv8 Library
The YOLO library is developed by Ultralytics. To install it, simply run the following command:
pip install ultralytics
You might encounter errors about outdated packages or other libraries that need to be installed. Just copy and paste those error messages into ChatGPT.
Get Images of Cows
We’ll find cows in meadows by using these images:
https://koveh.com/img/cows.jpg
https://koveh.com/img/cows2.jpg
https://koveh.com/img/cows3.jpg
Download these images and place them in the folder containing the “analyze_image.py” script.
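Before running any detection, it can help to confirm that the images really are where the script expects them. The sketch below is one possible check, assuming you run it from the folder that holds the images. The helper name missing_images is made up for this example.

```python
from pathlib import Path

def missing_images(folder, filenames):
    """Return the names from filenames that are not present in folder."""
    folder = Path(folder)
    return [name for name in filenames if not (folder / name).is_file()]

expected = ["cows.jpg", "cows2.jpg", "cows3.jpg"]

# Path.cwd() is the folder you are currently in (the one you opened with "cd").
missing = missing_images(Path.cwd(), expected)
if missing:
    print("Missing images:", ", ".join(missing))
else:
    print("All images found.")
```

If the script reports missing images, move the files or use “cd” to switch to the right folder.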
Write Cow Detection Code
Let’s write the basic code to detect cows in the images. First, set the path to the source image. Then, define the model: we’ll use a simple universal pre-trained model called “yolov8n.pt”. Next, decide whether to save a new image with highlighted cows. Also, set the confidence threshold (0 to 1) that a detection must reach to be reported. The lower the threshold, the more likely it is that every cow will be found, but the higher the risk that another object will be mistaken for a cow (e.g., a car may become a cow). Set the confidence threshold to 0.4.
Additional settings can be found in the official Ultralytics documentation.
Note: We don’t specifically define that we’re searching for cows; the model simply highlights every object it recognizes in the picture. We’ll narrow this down to cows later.
from ultralytics import YOLO
source = "/Users/daniil_kovekh/Desktop/bot/Image/find_cow/cows.jpg"
# Load a YOLOv8 model from a pre-trained weights file
model = YOLO('yolov8n.pt')
# Detect objects with a confidence threshold of 0.4 and save the annotated image
model.predict(source, save=True, conf=0.4)
Here are the results of an image with 2 cows on it:
Results with a large number of cows:
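To get a feel for what the confidence threshold does, here is a small sketch in plain Python. The detections list is made up for illustration and is not real model output. The helper keep_confident is a hypothetical stand-in for the filtering that YOLOv8 performs internally.

```python
# Hypothetical detections: (label, confidence) pairs, not real model output.
detections = [("cow", 0.91), ("cow", 0.55), ("cow", 0.35), ("car", 0.25), ("cow", 0.15)]

def keep_confident(detections, conf):
    """Keep only detections whose confidence is at least conf."""
    return [d for d in detections if d[1] >= conf]

for conf in (0.2, 0.4, 0.6):
    kept = keep_confident(detections, conf)
    labels = [label for label, _ in kept]
    print(f"conf={conf}: {len(kept)} objects kept -> {labels}")
```

With a threshold of 0.2, four objects pass, including the doubtful “car”. With 0.4, only the two most certain cows remain.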
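Improve Results – Decrease Confidence
To improve the results, we can adjust the confidence threshold and the Intersection over Union (IoU) threshold. IoU measures how much two detection boxes overlap. The threshold decides how much overlap is allowed before two boxes are treated as the same object, which matters when cows stand close together or partly hide each other. Change these settings within a range of 0 to 1.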
from ultralytics import YOLO
source = "/Users/daniil_kovekh/Desktop/bot/Image/find_cow/cows.jpg"
# Load a YOLOv8 model from a pre-trained weights file
model = YOLO('yolov8n.pt')
# Detect objects with confidence 0.4 and IoU threshold 0.4, and save the image
model.predict(source, save=True, conf=0.4, iou=0.4)
The results may still not be ideal, because when the confidence threshold is low, the AI might misidentify other objects, like cars, as cows.
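The IoU value itself is simple arithmetic: the area where two boxes overlap, divided by the total area they cover together. Below is a minimal sketch of that calculation. The box format (x1, y1, x2, y2) and the two example boxes are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if there is one.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    # max(0, ...) turns "no overlap" into an area of zero.
    intersection = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

# Two hypothetical 100x100 boxes, the second shifted right by 50 pixels.
box_1 = (0, 0, 100, 100)
box_2 = (50, 0, 150, 100)
print(round(iou(box_1, box_2), 2))  # 0.33
```

These two boxes have an IoU of about 0.33. With iou=0.4 they would count as two separate objects, because their overlap is below the threshold. A lower threshold would merge them into one.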
Improve Results – Increase Image Size
By default, YOLOv8 resizes each image to 640 pixels before analyzing it. The size of cows3.jpg is approximately 5000x3000 pixels, while cows.jpg has a different size. Determine the appropriate image size for each picture using the Python Imaging Library (PIL).
Install Pillow, the maintained version of PIL, using the terminal and the pip command:
pip install Pillow
Now, write Python code in Visual Studio Code:
from ultralytics import YOLO
from PIL import Image
source = "/Users/daniil_kovekh/Desktop/bot/Image/find_cow/cows3.jpg"
# Load the image and get its width
image = Image.open(source)
width, _ = image.size  # "_" holds the height, which we don't need
# Load a YOLOv8 model from a pre-trained weights file
model = YOLO('yolov8n.pt')
# Use an imgsz equal to the image width
model.predict(source, imgsz=width, save=True, conf=0.4, iou=0.4)
This approach yields better results, but feel free to experiment with the settings to further improve them.
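Very large values of imgsz make the analysis slower and use more memory. One option is to cap the size. The sketch below rounds the width up to a multiple of 32 and limits it to a maximum. The multiple of 32 assumes the usual YOLOv8 stride, and the cap of 1280 is an arbitrary choice for illustration. The helper name choose_imgsz is made up.

```python
import math

def choose_imgsz(width, stride=32, max_size=1280):
    """Round width up to a multiple of stride, but never above max_size."""
    rounded = math.ceil(width / stride) * stride
    return min(rounded, max_size)

print(choose_imgsz(5000))  # 1280 (capped at max_size)
print(choose_imgsz(1000))  # 1024 (rounded up to a multiple of 32)
```

You could then pass the result to the prediction, for example imgsz=choose_imgsz(width), and compare the detections with those from the full width.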
Count the Cows
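To count the cows, we’ll count the detections whose class is “cow”. Think of the detections as a list of pairs, each holding a label and a confidence, similar to the example below:
detections = [("cow", 0.8), ("cow", 0.5), ("dog", 0.4), ("cow", 0.66), ("duck", 0.4)]
This list records each detected object in the image and the model’s confidence that it is the specified animal. We’ll count all the cows in the list. In the example above, there are three cows. In YOLOv8, each label also has a numeric class index, stored in the dictionary model.names, so the code first looks up the index for “cow” and then counts the boxes with that index.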
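Note that a real Python dictionary can hold each key only once, so a record of detections with repeated labels is better kept as a list of pairs. Here is a minimal sketch of the difference, using the made-up confidences from the example. The names as_dict, as_list, and cow_count are illustrative.

```python
# A dictionary keeps only ONE value per key, so repeated "cow" keys overwrite each other.
as_dict = {"cow": 0.8, "cow": 0.5, "dog": 0.4, "cow": 0.66, "duck": 0.4}
print(as_dict)  # {'cow': 0.66, 'dog': 0.4, 'duck': 0.4}

# A list of (label, confidence) pairs keeps every detection.
as_list = [("cow", 0.8), ("cow", 0.5), ("dog", 0.4), ("cow", 0.66), ("duck", 0.4)]

# Count the pairs whose label is "cow".
cow_count = sum(1 for label, _ in as_list if label == "cow")
print(cow_count)  # 3
```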
from ultralytics import YOLO
from PIL import Image
source = "/Users/daniil_kovekh/Desktop/bot/Image/find_cow/cows3.jpg"
# Load the image and get its width
image = Image.open(source)
width, _ = image.size
# Load a YOLOv8 model from a pre-trained weights file
model = YOLO('yolov8n.pt')
# Look up the class index for "cow" in model.names (a dictionary: index -> name)
cow_class_index = None  # stays None if the model has no "cow" class
for key, value in model.names.items():
    if value == "cow":
        cow_class_index = key  # key is the class index, value is the class name
        break
# Run prediction on the image
results = model.predict(source, save=True, imgsz=width, conf=0.3, iou=0.5)
boxes = results[0].boxes
# Keep only the boxes whose class index matches the cow index
cow_detections = [box for box in boxes if int(box.cls) == cow_class_index]
print(f"Number of cows in {source}: {len(cow_detections)}")
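If you want a tally of every kind of object, not only cows, Python’s built-in Counter can do the counting. The sketch below runs on made-up data, so it works without the model. The names mapping is a hypothetical excerpt in the same shape as model.names, and count_by_label is an illustrative helper.

```python
from collections import Counter

def count_by_label(class_ids, names):
    """Count detections per label. names maps class index -> label."""
    return Counter(names[int(class_id)] for class_id in class_ids)

# Hypothetical data for illustration: a small excerpt of a names mapping
# and made-up class indices for seven detected boxes.
names = {0: "person", 16: "dog", 19: "cow"}
class_ids = [19, 19, 0, 19, 16, 19, 19]

counts = count_by_label(class_ids, names)
print(counts["cow"])  # 5
print(dict(counts))   # {'cow': 5, 'person': 1, 'dog': 1}
```

With a real result, the class indices could probably be taken from results[0].boxes.cls.tolist() and the mapping from model.names. Treat that as an assumption and check it against the Ultralytics documentation.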
Analyze Anything Else
You’re now equipped to write object detection code for various purposes. If you want to detect people or cups, simply change the value “cow” to “person” or “cup”.
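To make the label easy to swap, the lookup can be wrapped in a small function that also reports a typo clearly. This is only a sketch: find_class_index is a made-up helper, and the names mapping is a hypothetical excerpt in the same shape as model.names.

```python
def find_class_index(names, label):
    """Return the class index for label, or raise a helpful error."""
    for index, name in names.items():
        if name == label:
            return index
    known = ", ".join(sorted(names.values()))
    raise ValueError(f"Unknown label '{label}'. Known labels: {known}")

# Hypothetical excerpt of a names mapping, for illustration only.
names = {0: "person", 19: "cow", 41: "cup"}

print(find_class_index(names, "cup"))  # 41
```

Asking for a label the model does not know, such as “unicorn”, raises an error that lists the available labels instead of silently counting zero objects.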
Explore the documentation and analyze whatever you’d like – you can analyze videos or live streams, detect a person’s movements, or even draw an individual’s skeleton to assess the movements of athletes.
If you require an AI application, website, bot for Telegram, Google extension, automation tool, or web scraper, feel free to contact me at daniil@koveh.com. My team and I are eager to assist you.