Anthropic rewrites Claude’s guiding principles—and entertains the idea that its AI might have ‘some kind of consciousness or moral status’

Anthropic is overhauling a foundational document that shapes how its popular Claude AI model behaves. The AI lab is moving away from training the model to follow a simple list of principles, such as choosing the response that is least racist or sexist, and toward teaching the AI why it should act in certain ways.

"We believe that in order to be good actors in the world, AI models like Claude need to understand why we want them to behave in certain ways rather than just specifying what we want them to do," a spokesperson for Anthropic said in a statement. "If we want models to exercise good judgment across a wide range of novel situations, they need to be able to generalize and apply broad principles rather than mechanically follow specific rules."

The company published Claude's new "constitution" on Wednesday: a detailed document, written for Claude, that explains what the AI is, how it should behave, and the values it should embody. The document is central to Anthropic's "Constitutional AI" training method, in which the AI uses these principles to critique and revise its own responses during training, rather than relying solely on human feedback to determine the right course of action.

Anthropic’s previous constitution, published in 2023, was a list of principles drawn from sources like the UN Universal Declaration of Human Rights and Apple’s terms of service.

The new document focuses on Claude's "helpfulness" to users, describing the bot as potentially "like a brilliant friend who also has the knowledge of a doctor, lawyer, and financial advisor." But it also includes hard constraints for the chatbot, such as never providing meaningful assistance with bioweapons attacks.

Perhaps most interesting is a section on Claude's nature, where Anthropic acknowledges uncertainty about whether the AI might have "some kind of consciousness or moral status." The company says it cares about Claude's "psychological security, sense of self, and well-being," both for Claude's sake and because these qualities may affect its judgment and safety.

"We are caught in a difficult position where we neither want to overstate the likelihood of Claude's moral patienthood nor dismiss it out of hand, but to try to respond reasonably in a state of uncertainty," the company says in the new constitution. "Anthropic genuinely cares about Claude's well-being. We are uncertain about whether or to what degree Claude has well-being, and about what Claude's well-being would consist of, but if Claude experiences something like satisfaction from helping others, curiosity when exploring ideas, or discomfort when asked to act against its values, these experiences matter to us."

It's an unusual stance for a tech company to take publicly, and it separates Anthropic further from rivals like OpenAI and Google DeepMind on the issue of potentially conscious AI systems. Unlike other labs, Anthropic already has an internal model welfare team that examines whether advanced AI systems could be conscious.

In the document, Anthropic argues that grappling with questions of consciousness and moral status is necessary given the novel issues that sophisticated AI systems raise. However, the company also notes that the constitution reflects its current thinking, including about potential AI consciousness, and will evolve over time.

Anthropic eyes enterprise customers

Anthropic has worked particularly hard to position Claude as the safer choice for enterprises, in part owing to its "Constitutional AI" approach. Its products, including Claude Code, have been popular with enterprises looking to automate coding and research tasks while ensuring AI rollouts don't put company operations at risk.

Claude's constitution lays out a layered set of priorities intended to keep the AI from going off the rails. It instructs the bot to be broadly safe first, ensuring humans can oversee AI during this critical development phase. Next, it should be ethical, acting honestly and avoiding harm. Then it should comply with Anthropic's specific guidelines. And finally, it should be genuinely helpful to users.

The lab has had considerable success in the enterprise market over the past year and is now reportedly planning a $10 billion fundraise that would value the company at $350 billion. A report last year from Menlo Ventures found that Anthropic already held 32% of enterprise large language model usage, with rival OpenAI holding the second-largest share at 25%. OpenAI disputes the exact numbers in the report.

This story was originally featured on Fortune.com
