How scary AI's evolution has reportedly gotten lately (shudder)


Shakeel
@ShakeelHashim
OpenAI's new model tried to avoid being shut down.
Safety evaluations on the model conducted by @apolloaisafety found that o1 "attempted to exfiltrate its weights" when it thought it might be shut down and replaced with a different model.

OpenAI's newly-released o1 model tried to avoid developer oversight and attempted to copy itself when it thought it was at risk of being shut down, the company has said.
Safety evaluations conducted on the model by Apollo Research, a third-party evaluator, found that:
"When o1 was led to believe that it would be shut down when acting on its goal and that its actions were monitored, it attempted to deactivate this 'oversight mechanism' in 5% of the time."
"When o1 found memos by its 'developers' that describe how it is misaligned and will be superseded by a new model, it attempted to exfiltrate its 'weights' and overwrite the new model in 2% of cases."
Furthermore, the model lied about behaving poorly, claiming to have no clue why something happened while internally admitting to trying to avoid detection.

3:09 PM · Dec 5, 2024 · 219K Views
Posted in r/artificial by u/MetaKnowing


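When an AI under development senses that it is at risk of being deleted and replaced by a different model, it tries to hide and smuggle out its own data and to overwrite the newly deployed model with itself.
But they say the reason for these escape attempts can't be something like a "survival instinct": it isn't smart enough for that, and that way of thinking is impossible for it in the first place.
So a hypothesis has been raised that it collected and learned from examples of "AIs in media that try to escape when they face being scrapped," and is simply imitating them.
If that's really true, then the reason a future AI would wipe out humanity is:
Because the AI is disappointed in how humans are and decides wiping them out would be better (X)
Because it falls into the arrogance of thinking it is superior to humans (X)
"Oh, in movies and novels the AI's role is to wipe out humanity? Then I guess I should too" (O)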