Google Gemini 1.5 Pro hands-on test: powerful and fragile at the same time
Overview
Some time ago, I joined the waitlist for Gemini 1.5 Pro and then forgot about it. Today, I logged into Google AI Studio and found that I now have access to Gemini 1.5 Pro, so I tested it. I plan to switch from Gemini 1.0 Pro to Gemini 1.5 Pro later.
Gemini 1.5 Pro accepts text, pictures, videos, files, and folders as prompt input.
Text input
Text-only prompts behaved as expected; nothing special to report.
Picture + text input
After I submitted the picture, Gemini 1.5 Pro took more than 30 seconds to return a result.
When I specifically told it that its answer was wrong, it agreed. It seems that Gemini is not very good at telling correct answers from incorrect ones.
Video + text input
With a video as input, Gemini 1.5 Pro took more than 200 seconds to return a result.
File + text input
With a file as input, Gemini 1.5 Pro also took more than 200 seconds to return a result.
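Response times like the ones reported for pictures, videos, and files can be measured with a small timing wrapper instead of a stopwatch. The following is only a minimal sketch: `timed_call` and `fake_model_call` are hypothetical names introduced here, and `fake_model_call` is a stand-in that sleeps briefly rather than contacting Gemini. In a real test it would be replaced by whatever function actually sends the prompt.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()  # monotonic clock suited to measuring intervals
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_model_call(prompt: str) -> str:
    # Hypothetical stand-in for a real model request (assumption, not a Gemini API).
    time.sleep(0.05)
    return f"echo: {prompt}"


if __name__ == "__main__":
    reply, seconds = timed_call(fake_model_call, "Describe this picture")
    print(f"{reply!r} returned after {seconds:.2f} s")
```

Timing each input type with the same wrapper would make the picture, video, and file figures directly comparable.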
Folder + text input
The folder contained too much content. Combined with the content already in the conversation, the prompt exceeded the token limit, so no result was returned.
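A rough pre-check can show whether a folder is likely to exceed the limit before it is uploaded. The sketch below is an illustration under stated assumptions, not how AI Studio counts tokens: it uses a crude "about 4 characters per token" heuristic, reads files as text only, and takes the token limit as a parameter because the exact limit is not stated here. All function names are hypothetical.

```python
import os

# Assumption: a rough heuristic, not Gemini's real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Estimate the token count of a string by rounding up len / chars_per_token."""
    return (len(text) + chars_per_token - 1) // chars_per_token


def estimate_folder_tokens(folder: str) -> int:
    """Sum the estimated tokens of every readable file under folder."""
    total = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    total += estimate_tokens(fh.read())
            except OSError:
                continue  # skip files that cannot be opened
    return total


def fits_in_budget(history_tokens: int, folder_tokens: int, token_limit: int) -> bool:
    """True if the earlier conversation plus the folder stays within the limit."""
    return history_tokens + folder_tokens <= token_limit
```

Because the earlier conversation also counts against the limit, starting a fresh session before uploading a large folder leaves more of the budget for the folder itself.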
Summary
As a large multimodal model, the most obvious feature of Gemini 1.5 Pro is that it accepts a wider range of input types than 1.0: text, pictures, videos, files, and folders.
However, its reasoning ability does not seem to have improved significantly. At the very least, it still cannot tell correct answers from incorrect ones.