By Oscar Frith-Macdonald, 30 July 2024
Before we launch into this...
Allow me to introduce Oscar Frith-Macdonald, one of our distinguished developers at DF. Oscar has studied the use of Machine Learning to facilitate image content searching for the UK National Archives. Who better to educate our team on the inner workings of artificial intelligence? His explanations are so excellent that we’ve invited him to write for the esteemed Weetbicks blog.
Take it away Oscar...
For a while now, you have been able to access AI through the Insert from URL script step and standard API calls (see Cath’s blog article). But FileMaker 2024 integrates much more of this directly into the platform, making it easier than ever to use AI in your solutions.
Now, there is a lot of great information on the cool things AI can do, and also on the rather funny times it gets things wrong. However, we feel there is a lack of information on exactly what AI is. This article aims to give you an understanding of how AI works, which should help you avoid some of the pitfalls.
We will cover what AI is, some of the dangers of using AI, and how it can be useful. Hopefully, this will give you an idea of how best to use the new features safely.
The best way to think of AI is as the outer shell of a Russian doll made up of deep learning (deep neural networks), artificial neural networks, and machine learning.
To break it down:
In essence, each layer builds on the previous one, advancing the capabilities and complexity of AI.
An artificial neural network (ANN) is an interconnected group of nodes, similar to the vast network of neurones in a brain.
Here, each circular node represents an artificial neurone, and an arrow represents a connection from the output of one artificial neurone to the input of another.
An ANN is composed of layers of interconnected nodes, or neurones.
These layers are typically organised as follows:
Each connection between nodes has a “weight” to it, which is adjusted during the training process to minimise errors in the output. This process is similar to learning in biological brains, where synaptic strengths change to improve performance.
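To make the idea of adjusting a weight concrete, here is a minimal sketch of a single connection being trained. It assumes a toy rule (the output should be twice the input), a squared-error measure, and a hand-picked learning rate; real networks adjust millions of weights at once, but the principle is the same.

```python
# Toy example: one connection (one weight) learning the rule y = 2 * x.
# The data, starting weight, and learning rate are illustrative assumptions.
training_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, expected output)

weight = 0.0          # start with an uninformed guess
learning_rate = 0.05  # how large each adjustment is

for epoch in range(200):
    for x, target in training_data:
        prediction = weight * x
        error = prediction - target
        # The gradient of the squared error with respect to the weight
        # is 2 * error * x, so we nudge the weight the opposite way.
        weight -= learning_rate * 2 * error * x

print(round(weight, 3))
```

After enough passes over the data the weight settles very close to 2, which is the value that minimises the error for these examples.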
Deep Neural Networks (DNNs) are a specific type of artificial neural network, with multiple layers in between the input and output layers.
A simple ANN may have just one hidden layer (shown in the diagram above), whereas a DNN adds many more of those hidden layers.
These additional layers enable DNNs to model intricate patterns and representations, making them highly effective for a wide range of tasks, particularly those involving large and complex datasets.
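As a rough illustration of what "adding more hidden layers" means, the sketch below passes an input through a stack of layers, each feeding the next. The layer sizes, the weights, and the choice of a sigmoid activation are all assumptions picked for the example; a trained network would have learned its weights from data.

```python
import math


def sigmoid(value):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-value))


def layer_forward(inputs, weights, biases):
    """Compute one layer: each neurone takes a weighted sum of all inputs."""
    outputs = []
    for neurone_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neurone_weights, inputs)) + bias
        outputs.append(sigmoid(total))
    return outputs


def network_forward(inputs, layers):
    """Feed the outputs of each layer into the next one."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical hand-picked weights: 2 inputs -> 3 hidden -> 2 hidden -> 1 output.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.2, -0.5, 0.3], [0.7, 0.1, -0.4]], [0.05, -0.05]),
    ([[0.6, -0.6]], [0.0]),
]

output = network_forward([1.0, 0.5], layers)
print(output)
```

Making the network "deeper" is simply a matter of adding more entries to the `layers` list.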
So what's happening in these hidden layers?
Each hidden layer transforms the input features into increasingly higher-level abstractions. Initially, the network might extract simple features (e.g., edges in an image). As the data passes through more layers, the network can recognise more complex patterns and structures (e.g., shapes and objects).
For instance, in image processing:
Machine learning (ML) focuses on developing algorithms that enable computers to learn from and make predictions or decisions based on data.
Unlike traditional programming, where explicit instructions are given to perform a task, machine learning allows systems to learn and improve from experience without being explicitly programmed.
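The difference can be sketched with a small example. In the first function the rule is written by hand; in the second, the same rule is estimated from example data using a simple least-squares line fit. The temperature data and function names are made up for illustration, and a straight-line fit is only one of many learning techniques.

```python
# Traditional programming: the rule is written explicitly by the developer.
def fahrenheit_explicit(celsius):
    return celsius * 9 / 5 + 32


# Machine learning: the rule is estimated from examples (illustrative data).
examples = [(0, 32), (10, 50), (20, 68), (30, 86), (40, 104)]  # (celsius, fahrenheit)


def fit_line(points):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(examples)


def fahrenheit_learned(celsius):
    return slope * celsius + intercept


print(fahrenheit_explicit(25), fahrenheit_learned(25))
```

Both functions give essentially the same answer, but the second one was never told the conversion formula; it recovered it from the data.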
Machine learning consists of three main points:
Data vectorisation is a key component of many AI models, though not all, which is why it wasn't included in the earlier Russian doll analogy.
Data vectorisation involves transforming raw data into a numerical format suitable for machine learning algorithms. Essentially, it converts data into numerical vectors, which are arrays of numerical values representing various features or attributes of the data. This numerical representation allows qualitative information, such as meaning, to be quantified and compared to find similarities.
When data is vectorised, each position in the resulting vector is referred to as a "dimension." Generally, the more dimensions a vector has, the more detail it can capture about the data, and the finer the comparisons it allows.
Consider a simple example where the only two words in your vocabulary are "door" and "wall." In this case, the vectors might only have two dimensions:
In other words, each word in this limited vocabulary is represented by a pair of numbers. Comparing "door" and "wall," we see they are distinctly different in this simple two-dimensional space. If we introduce the word "window," its vector might look like this: "window": [0.6, 0]. Comparing "window" to "door" and "wall," we can see that "window" is more similar to "door" than to "wall."
In reality, vectors typically have many dimensions, representing various attributes or features. These dimensions can be imagined in a virtual space, where the vector indicates a position within this space.
Cosine similarity is then used to determine how close two vectors are within this multi-dimensional space. By measuring the cosine of the angle between two vectors, we can quantify their similarity on a scale from -1 (pointing in opposite directions) through 0 (unrelated) to 1 (pointing in the same direction).
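Here is a minimal sketch of cosine similarity applied to the door, wall, and window example. The vector for "window" comes from the text above; the vectors for "door" and "wall" are assumed values chosen so that the two words sit on different axes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


vectors = {
    "door": [1.0, 0.0],    # assumed value for illustration
    "wall": [0.0, 1.0],    # assumed value for illustration
    "window": [0.6, 0.0],  # value used in the text
}

window_vs_door = cosine_similarity(vectors["window"], vectors["door"])
window_vs_wall = cosine_similarity(vectors["window"], vectors["wall"])
door_vs_wall = cosine_similarity(vectors["door"], vectors["wall"])

print(window_vs_door, window_vs_wall, door_vs_wall)
```

With these values, "window" scores close to 1 against "door" and 0 against "wall." Note that cosine similarity compares direction only, so the shorter length of the "window" vector does not lower its score.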
Now that you have an understanding of the different components that make up AI, let’s dive deeper into what AI truly is.
Knowing that a house is made of bricks and cement doesn't fully convey what a house is; it’s the same with AI.
When you hear the term ‘AI,’ you might think of iconic characters like HAL 9000 from 2001: A Space Odyssey or Data from Star Trek. These examples represent machines with the capacity for reason and autonomy, known as artificial general intelligence (AGI). However, this is not what most modern AI refers to. Today, when we talk about AI, we're actually discussing Narrow Artificial Intelligence (NAI).
NAI is essentially an excellent piece of prediction software designed to perform specific tasks. Unlike AGI, which has the ability to understand, learn, and apply knowledge across a wide range of tasks at a human level, NAI is specialised. It's important to remember that it will only be an excellent piece of prediction software for the inputs it was trained for.
Here are some key points to understand NAI better:
The performance of NAI systems is tightly linked to the data they are trained on:
There are three main forms of training, and all are used within the AI space.
1. Supervised Learning:
Supervised learning plays a significant role in training models like GPT. During the initial pre-training phase, the model learns from a large corpus of text data that includes pairs of input (e.g., text snippets) and output (e.g., the next word or sequence of words). This helps the model understand language patterns, grammar, and context. Key points include:
2. Unsupervised Learning:
While the initial phase of pre-training can be considered supervised learning due to the nature of the task (predicting the next word), it also has aspects of unsupervised learning:
3. Reinforcement Learning:
Reinforcement learning has been notably applied in models like ChatGPT for specific aspects of training, such as fine-tuning the model's responses to be more aligned with human preferences.
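Returning to the next-word prediction task used in pre-training, the sketch below shows how input and output pairs can be generated from raw text with no human labelling, which is why that phase has both supervised and unsupervised aspects. It is a simplification: it splits on whitespace and uses a fixed context size, whereas real language models work on tokens and far longer contexts. The function name and sample sentence are made up for the example.

```python
def next_word_pairs(text, context_size=3):
    """Build (context, next word) training pairs from raw text."""
    words = text.split()
    pairs = []
    for i in range(1, len(words)):
        # The context is the few words immediately before position i.
        context = words[max(0, i - context_size):i]
        pairs.append((context, words[i]))
    return pairs


pairs = next_word_pairs("the cat sat on the mat")
for context, target in pairs:
    print(context, "->", target)
```

Every "label" here (the next word) comes from the text itself, so a large corpus yields a large training set without anyone annotating it by hand.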
It's pretty difficult to talk about AI and not mention OpenAI and GPT at all, so how does GPT fit into the world of NAI? Let's start with what GPT stands for:
Generative Pre-trained Transformer
GPT is actually an excellent example of NAI, as it's very obvious that none of the GPT models can do everything. They have had to be narrowed down to specific tasks such as ChatGPT, ImageGPT, etc. As their names suggest, they have similar base GPT models but have then been trained for specific tasks.
These different models can work in conjunction with each other to provide a “wider” AI experience. This is how a chat assistant can produce an image from a text input: the language model passes the request on to a separate text-to-image model, such as OpenAI's DALL·E.
Example Workflow
1. User Input: The user provides a text prompt, e.g., "a two-story pink house shaped like a shoe with a small garden in front."
2. Text Processing: A language model (like ChatGPT) processes the text to ensure it is clear and to extract the essential elements, e.g., "two-story pink house," "shoe shape," "small garden in front."
3. Image Generation: The processed text is fed into a text-to-image model (like DALL·E), which then generates an image based on the description.
While AI is a powerful tool, there are important considerations and potential pitfalls to be aware of. Understanding these can help you better navigate the challenges and maximise the benefits of AI.
Pitfalls and Issues:
Hallucinations are probably one of the biggest issues with AI: the AI produces an answer that sounds plausible but is wrong, and presents it as fact.
Hallucinations can happen for several reasons:
To mitigate these issues, it’s important to:
To enhance the reliability and accuracy of AI responses, consider the following strategies:
1. Use High-Quality Data: Ensure the training data is high-quality, representative, and as free from biases as possible. Regularly update the data to keep the AI model current. Supply your own existing data and examples when asking the AI a question; this will help to ensure it's only using information you consider to be true.
2. Implement Human Oversight: Use AI as a tool to speed up human decision-making rather than replacing it entirely. Always have a human look over the result to ensure it is correct.
3. Clarify Queries: When interacting with AI, especially language models, frame your queries clearly and precisely. The more specific the input, the more accurate the output is likely to be. Don't be afraid to send pages of information in your prompt to the AI, but be aware there is usually a cost for this.
4. Employ Ensemble Methods: Combine multiple models or methods to cross-verify results. Ensemble techniques can help improve accuracy and reduce the likelihood of errors. It's also important to use the correct model for your task.
5. Educate Users: Ensure that users understand the capabilities and limitations of AI. Providing training on how to interact with AI systems and interpret their outputs can significantly improve outcomes. One of the most important points is to ensure your users are aware that the AI could be wrong.
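As a rough sketch of the cross-verification idea in point 4, the example below takes a majority vote across several model answers and flags the result for human review (point 2) when the models disagree. The answers are hard-coded placeholders standing in for real model calls, and majority voting is only one simple form of ensemble method.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most common answer and the share of models that gave it."""
    counts = Counter(answers)
    answer, votes = counts.most_common(1)[0]
    agreement = votes / len(answers)
    return answer, agreement


# Hypothetical outputs from three different models for the same question.
model_answers = ["Paris", "Paris", "Lyon"]

answer, agreement = majority_vote(model_answers)

# Anything short of full agreement is passed to a person to check.
needs_human_review = agreement < 1.0

print(answer, round(agreement, 2), needs_human_review)
```

Agreement between models does not prove an answer is correct, since models can share the same blind spots, but disagreement is a cheap and useful signal that a human should take a closer look.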