I tried 8 of Google's newest AI products and updates at I/O 2024


The improved long context window can even pull information from multiple documents when responding to a single prompt. In the side panel in Docs, I asked for help writing a sample letter to a potential job candidate. In the prompt I linked to the job description document and the applicant’s PDF portfolio, both of which were in my Drive, and instantly received an email draft that factored in relevant details from both documents.

Gemini 1.5 Pro isn’t our only shiny new model, though: I also got to try the freshly announced Imagen 3, our highest-quality text-to-image model yet. One new capability I was excited about was generating decorative text and letters, so I put it through its paces. I started by asking for a stylized alphabet, like letters spelled out in jam on toast, or with silver balloons floating in the sky. Imagen 3 generated a full alphabet of letters, which I could then use to type out my own (delicious) menus.

After my Imagen 3 interlude, I continued with more Gemini demos. In one of them, I could pull up Gemini’s overlay on an Android phone and ask questions about anything on the screen. This really showed how we’re not only expanding what you can ask Gemini, but we’re also making Gemini context aware, so it can anticipate your needs and provide helpful suggestions.

The use case here was a lengthy oven manual. Whether it’s a demo or real life, that’s not something I’d be excited about reading. Instead of skimming through the document, I pulled up Gemini and immediately got an “Ask this PDF” suggestion. I tested questions like “how do I update the clock” and quickly got accurate answers. It worked just as well with YouTube videos. Instead of watching a 20-minute workout video, I asked a quick question about how to modify planks, got an answer, and was on my way to the next demo. There I tested a new conversation mode called Gemini Live that lets you talk with Gemini in the app, no typing required.

Speaking with Gemini was a different experience than the traditional chatbot interface: Gemini’s answers are a lot more conversational than the paragraphs of text and bullet-pointed lists you might usually get. In my demo, I learned you could even cut off Gemini in the middle of an answer. After asking for a list of kids’ activities for a summer vacation, I was able to interrupt the list of suggestions to dive deeper into what materials I’d need for tie-dyeing a shirt.

The Project Astra — or “advanced seeing and talking responsive agent” — demo took things a step further to show the cutting edge of where our conversational AI projects are heading.


