Gemini Pro can refer to audio without the need for a transcript

Google Gemini has sprouted ears. The company announced that its Gemini 1.5 Pro model has been released for public preview in over 180 countries, bringing with it a feature that could make working with recordings more convenient and efficient. Gemini Pro is the mid-tier model of the Gemini suite, which also includes Gemini Ultra, Google’s most powerful model. Along with a number of API updates and improvements, the new feature, which Google calls its first-ever native audio (speech) understanding capability, allows users to refer Gemini to audio files, even audio from video, to complete instructions. This means Gemini can do things like parse and summarize business calls or any recorded video, all without the need for transcription.

The following was originally published February 22, 2024:

UPDATE: On February 22, The Verge reported that Google apologized for what it describes as “inaccuracies in some historical image generation depictions” with its Gemini AI tool including images that depicted Nazis as people of color; Google has suspended aspects of image generation.

Generative AI has struggled with amplifying gender and racial stereotypes, in part because the source material used to train AI is rife with ingrained biases. Apparently, Google’s attempt to offset these stereotypes instead amplified them in some cases. Ironically, images of the Founding Fathers and Nazi officers “corrected” with color-blind diversity code merely proved how difficult these stereotypes are to correct for, at least (or especially) with an algorithm.

“We’re working to improve these kinds of depictions immediately,” Google said in a statement. “Gemini’s AI image generation does generate a wide range of people. And that’s generally a good thing because people around the world use it. But it’s missing the mark here.”

The controversy appears to be an equal-opportunity offender, and Google is taking flak from all sides for applying “diversity” as a blunt, inaccurate instrument. Also, if you read between the lines, some of the problem lies in the kind of search prompts that are used. Regardless, it’s a small bump in the road for AI on the back of what is a much more deeply rutted superhighway of history. If only AI could solve that. It won’t, but it is the future, for better or worse, or something in between. Eventually the image generator will be turned back on. -Cynthia Wisehart

UPDATE: Since publication, Google has made an official announcement launching Gemini Business, which costs $20 a month on top of the $6 Workspace subscription.

Google is apparently not done revealing its plans for Gemini, the recently revealed AI model that replaced Bard in the company’s catalog. Now, it looks as though Gemini is also replacing Duet AI as the enterprise-focused AI product for Google Workspace users.

Google Workspace customers can expect an option to let Gemini help them with their work tasks in the near future, as references to “Gemini Business” and “Gemini Enterprise” have begun popping up. Premature patch notes reported by 9to5Google’s Dylan Roussel as well as TestingCatalog.eth show descriptions of the two products, promising enhanced enterprise-grade security features as well as full integration into Google Workspace.

In addition to enhanced security, the patch notes promise users that their conversations are not used to train Gemini AI models.

Also revealed in the uncovered data is that the Gemini Enterprise and Gemini Business products have been optimized for English and are to be available in over 100 countries and territories. No pricing has been revealed, though it is worth noting that Duet AI, the product that Gemini Enterprise is replacing, was priced at an additional $30 a month. While the patch notes reference 2.21.2024, Google has not commented on an official release date as of press time.

Derek Wiley

Derek Wiley is the content producer for Sound & Contractor and TWICE. With a background chiefly in production, Derek is quickly becoming a Pro AV enthusiast. When not writing, he enjoys all things gaming, music, and live events.
