Anthropic halves hallucinations, adds 200k context as it launches Claude 2.1 with API access

With OpenAI in turmoil, rival Anthropic has released its latest model, Claude 2.1, which it says sets a new benchmark for AI, particularly in enterprise use.

As of Nov. 21, Claude 2.1 is live, powering claude.ai’s chat experience, and is accessible via API. The company says the update is more than an incremental change: it represents a “significant” leap in AI capabilities, with a 200K token context window, a lower hallucination rate, and a new tool use feature.

Doubling Capacity with 200K Token Context Window

Claude 2.1 expands the context window to 200,000 tokens, double the 100,000 of its predecessor and equivalent to roughly 150,000 words or more than 500 pages. The change responds to growing demand for processing long documents in a single prompt. Users can now input large-scale documents, ranging from comprehensive technical documentation to extensive financial reports, and ask Claude to summarize them, answer questions about them, or compare them.
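For readers who want to gauge whether a document would fit, the figures above imply about 1.33 tokens per word (200,000 tokens for 150,000 words). The sketch below applies that ratio as a rough estimate. The function names are hypothetical, and whitespace word counting is an assumption that only approximates a real tokenizer, so treat the result as a ballpark figure.

```python
# Rough context-window check based on the article's figures.
# Assumption: 200,000 tokens ~ 150,000 words; real tokenizers will differ.
CONTEXT_WINDOW_TOKENS = 200_000
WORDS_EQUIVALENT = 150_000


def estimate_tokens(text: str) -> int:
    """Estimate the token count from a whitespace word count, rounding up."""
    words = len(text.split())
    # Integer ceiling division avoids floating-point rounding surprises.
    return (words * CONTEXT_WINDOW_TOKENS + WORDS_EQUIVALENT - 1) // WORDS_EQUIVALENT


def fits_in_context(text: str, reserved_for_reply: int = 0) -> bool:
    """Return True if the estimated prompt plus a reserved reply budget fits."""
    return estimate_tokens(text) + reserved_for_reply <= CONTEXT_WINDOW_TOKENS
```

Reserving part of the window for the reply (the hypothetical `reserved_for_reply` parameter) reflects the fact that a model's output also consumes context.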

Anthropic also claims that Claude 2.1 is more reliable, making false statements at half the rate of its predecessor. The company says it tested Claude 2.1 against a curated set of complex, factual questions and found the model more likely to provide correct information or to acknowledge uncertainty, which it argues will bolster trust in AI-driven products.

Tool Use: Bridging AI and Operational Processes

Tool use, a beta feature in Claude 2.1, is meant to integrate the model with a company's existing processes and systems. Claude can now interact with and orchestrate developer-defined functions, APIs, and web sources. According to Anthropic, this lets Claude perform complex numerical reasoning, translate natural language requests into structured API calls, and retrieve information from private databases.
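To illustrate the general idea of turning a structured call into a function invocation, here is a minimal sketch. It does not use Anthropic's API or its tool format: the JSON shape, the `TOOLS` registry, the `tool` decorator, and the `compound_interest` function are all hypothetical stand-ins for whatever a developer would define.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry of developer-defined functions (not Anthropic's API).
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name so it can be called by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def compound_interest(principal: float, rate: float, years: int) -> float:
    """Example numerical tool: balance after annual compounding."""
    return round(principal * (1 + rate) ** years, 2)


def dispatch(call_json: str) -> Any:
    """Parse a structured call and run the matching registered function."""
    call = json.loads(call_json)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


# Assumed model output for a request such as
# "What is $1,000 worth after two years at 5%?"
model_output = (
    '{"name": "compound_interest", '
    '"arguments": {"principal": 1000, "rate": 0.05, "years": 2}}'
)
result = dispatch(model_output)
```

In this arrangement the model only proposes the call; the developer's code decides which functions exist and actually executes them.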

Claude 2.1 also brings improvements to the developer experience, notably the newly designed Workbench product in the developer console. Workbench lets developers experiment with prompts interactively and iterate on them to tune Claude's behavior for a specific project or enterprise use case.

Anthropic is run by ex-OpenAI employees and recently received billions in funding from Amazon and Google.
