- By Ashish Singh
- Wed, 15 Mar 2023 03:50 PM (IST)
- Source: JND
OpenAI, the San Francisco-based technology startup backed by technology giant Microsoft and developer of the popular ChatGPT chatbot, on Wednesday announced GPT-4, its next-generation large multimodal model, which accepts both image and text inputs.
How GPT-4 Outperforms GPT-3.5:
GPT-4, OpenAI's new language model, replaces GPT-3.5, the current iteration. With GPT-4, users will be able to get more precise answers to challenging questions. For those who are unfamiliar, GPT stands for Generative Pre-trained Transformer, a deep learning approach that uses artificial neural networks to generate human-like text.
OpenAI asserts that with the release of GPT-4, its generative AI tools will be more advanced in three key categories: creativity, visual comprehension, and context handling. The chatbot can now produce more creative output, such as screenplays, music, and technical writing.
"We've launched GPT-4, the latest milestone in OpenAI's endeavour to scale up deep learning," the company announced on Tuesday in a blog post.
"We spent 6 months iteratively aligning GPT-4 using learning from our adversarial testing programme as well as ChatGPT, resulting in our best-ever factuality, steerability, and refusal to step outside of guardrails outcomes," the blog post added.
GPT-4 outperforms existing large language models (LLMs), including most state-of-the-art (SOTA) models that may rely on benchmark-specific construction or additional training methods, the company claimed.
"In 24 of the 26 languages examined, GPT-4 surpasses GPT-3.5 and other LLMs (Chinchilla, PaLM), including low-resource languages like Latvian, Welsh, and Swahili," the company added.
Internally, the company has been using the new model, with significant impact on functions such as support, sales, content moderation, and programming. Unlike the text-only default, GPT-4 can take a request that combines text and images, enabling users to specify any vision or language task.
(With Agency Inputs)