[Overseas DS] AI Causes Real Harm

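[Overseas DS] presents the views of industry experts as reported in leading data science publications abroad. Our data science management research institute (MDSA R&D) runs this content partnership on the condition that the original English text is published alongside the translation.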

Focus on everyday dangers rather than existential worries

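The real threat from artificial intelligence is not the vague story that it could wipe out humanity. It lies in harms closely tied to our daily lives, such as wrongful arrests, an expanding surveillance dragnet, defamation and deepfake pornography.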

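Contrary to the insubstantial vision of the future that many tech companies paint, AI technology is already used to enable routine discrimination in housing, criminal justice and health care, and to spread hate speech and misinformation. Workers are also having their wages stolen through biased wage-setting algorithms, and such AI programs are becoming ever more widespread.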

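Even with real threats in plain view, in May the nonprofit Center for AI Safety released a statement, co-signed by hundreds of industry leaders including OpenAI CEO Sam Altman, that warned of “the risk of extinction from AI” and likened it to nuclear war and pandemics. Altman had earlier alluded to this risk at a congressional hearing, noting that generative AI tools could go “quite wrong.” Then in July, executives from AI companies met with President Joe Biden and made several toothless pledges to curtail “the most significant sources of AI risks,” emphasizing existential threats over real ones. To justify these claims, they produce bogus reports on existential risk through corporate AI labs and stir up fear with inflated terminology, diverting the attention of regulators.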

Facing the real threats requires understanding AI technology correctly

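The public and regulatory agencies must not fall for this science-fiction sleight of hand. To understand the real harm AI is doing today, we should look to the scholars and activists who practice peer review and push back against excessive AI fear-mongering.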

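For a clear discussion, the ambiguity of the term “AI” has to be removed first. In one sense it is the name of a subfield of computer science. In another it can refer to the computing techniques developed in that subfield, and today it mostly denotes pattern matching over large data sets and the generation of new media based on those patterns. In marketing copy and start-up pitch materials, meanwhile, “AI” passes for a magic powder that supercharges a business.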

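When OpenAI released ChatGPT late last year and Microsoft integrated the tool into Bing search, text synthesis machines became the most prominent AI systems. Large language models such as ChatGPT extrude remarkably fluent and coherent text, but they do not understand what the text means, let alone reason. Attributing comprehension to a technology that has none makes using these systems no different from a tarot reading: you receive an arbitrarily determined answer and then persuade yourself to bridge the logical gap between question and answer, which is nothing more than after-the-fact interpretation.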

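Unfortunately, the output of generative AI looks so plausible that, unless its synthetic origin is clearly disclosed, it can harm the information ecosystem. Not only is there a risk of mistaking it for reliable information, but this content, which has no value as information, also amplifies the biases embedded in its training data (in this case, every kind of prejudice found on the Internet). Synthetic text also sounds authoritative even though it cites no real sources. The longer this spill of synthetic text continues, the harder it becomes to find trustworthy sources, and the harder it becomes to trust them even once they are found.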

Accessible, but not a technology that helps the vulnerable

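Those who sell generative AI argue that text synthesis machines can solve various structural problems in our society, such as the shortage of teachers in K–12 education, the lack of access to health care for low-income people and the lack of legal aid for people who cannot afford a lawyer. But the technology does not really help the people who need help, and it also harms workers. It took enormous amounts of training data from artists and authors without any compensation. Paradoxically, in order to avoid generating harmful output, it also repeatedly exposed data-labeling workers to harmful content, causing them psychological distress. The gig workers and contractors who do this work in poor conditions have been pushed to the bottom in pay and working conditions.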

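Finally, employers have used automation to cut costs, laying people off from once-stable jobs and then rehiring them as lower-paid workers to correct the errors of the automated systems. This is most starkly visible in the ongoing actors’ and writers’ strikes in Hollywood, where perpetual rights to use 3D-modeled replacements of actors are being bought up and writers are hired on a one-off basis to revise AI-generated scripts.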

Above all, policy must be built on scientific evidence

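AI policy should be grounded in science and built on relevant research, yet far too much of the material comes from corporate labs or from academic groups funded by the AI industry. Much of it is pseudoscience: it cannot be reproduced, hides behind trade secrecy, is full of hype and relies on evaluation methods that lack construct validity (that is, whether a test actually measures the concept it claims to measure).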

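A recent notable example is a 155-page preprint titled “Sparks of Artificial General Intelligence: Early Experiments with GPT-4.” In it, Microsoft Research claims to have found “intelligence” in the output of GPT-4, one of OpenAI’s text synthesis machines. OpenAI’s own technical reports on GPT-4 likewise claim that its systems can solve new problems not found in their training data, but no one can verify this because OpenAI provides neither access to those data nor even a description of them. Meanwhile “AI doomers,” who try to focus the world’s attention on the fantasy of all-powerful machines turning rogue and destroying humanity, cite this kind of baseless material rather than research on the actual harm companies are doing in the real world in the name of developing AI.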

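Policymakers should investigate the harms that arise from delegating authority to automated systems, including the unregulated and excessive accumulation of data and computing power, the environmental costs of model training and inference, damage to welfare and the disempowerment of the poor, and intensified policing of Black and Indigenous people. In doing so they should insist on rigorous scholarly methods and, through careful policy, keep the focus on the people being harmed.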

Wrongful arrests, an expanding surveillance dragnet, defamation and deep-fake pornography are all actually existing dangers of so-called “artificial intelligence” tools currently on the market. That, and not the imagined potential to wipe out humanity, is the real threat from artificial intelligence.

Beneath the hype from many AI firms, their technology already enables routine discrimination in housing, criminal justice and health care, as well as the spread of hate speech and misinformation in non-English languages. Already, algorithmic management programs subject workers to run-of-the-mill wage theft, and these programs are becoming more prevalent.

Nevertheless, in May the nonprofit Center for AI Safety released a statement, co-signed by hundreds of industry leaders, including OpenAI’s CEO Sam Altman, warning of “the risk of extinction from AI,” which it asserted was akin to nuclear war and pandemics. Altman had previously alluded to such a risk in a Congressional hearing, suggesting that generative AI tools could go “quite wrong.” And in July executives from AI companies met with President Joe Biden and made several toothless voluntary commitments to curtail “the most significant sources of AI risks,” hinting at existential threats over real ones. Corporate AI labs justify this posturing with pseudoscientific research reports that misdirect regulatory attention to such imaginary scenarios using fear-mongering terminology, such as “existential risk.”

The broader public and regulatory agencies must not fall for this science-fiction maneuver. Rather we should look to scholars and activists who practice peer review and have pushed back on AI hype in order to understand its detrimental effects here and now.

Because the term “AI” is ambiguous, it makes having clear discussions more difficult. In one sense, it is the name of a subfield of computer science. In another, it can refer to the computing techniques developed in that subfield, most of which are now focused on pattern matching based on large data sets and the generation of new media based on those patterns. Finally, in marketing copy and start-up pitch decks, the term “AI” serves as magic fairy dust that will supercharge your business.

With OpenAI’s release of ChatGPT (and Microsoft’s incorporation of the tool into its Bing search) late last year, text synthesis machines have emerged as the most prominent AI systems. Large language models such as ChatGPT extrude remarkably fluent and coherent-seeming text but have no understanding of what the text means, let alone the ability to reason. (To suggest so is to impute comprehension where there is none, something done purely on faith by AI boosters.) These systems are instead the equivalent of enormous Magic 8 Balls that we can play with by framing the prompts we send them as questions such that we can make sense of their output as answers.

Unfortunately, that output can seem so plausible that without a clear indication of its synthetic origins, it becomes a noxious and insidious pollutant of our information ecosystem. Not only do we risk mistaking synthetic text for reliable information, but also that noninformation reflects and amplifies the biases encoded in its training data—in this case, every kind of bigotry exhibited on the Internet. Moreover the synthetic text sounds authoritative despite its lack of citations back to real sources. The longer this synthetic text spill continues, the worse off we are, because it gets harder to find trustworthy sources and harder to trust them when we do.

Nevertheless, the people selling this technology propose that text synthesis machines could fix various holes in our social fabric: the lack of teachers in K–12 education, the inaccessibility of health care for low-income people and the dearth of legal aid for people who cannot afford lawyers, just to name a few.

In addition to not really helping those in need, deployment of this technology actually hurts workers: the systems rely on enormous amounts of training data that are stolen without compensation from the artists and authors who created it in the first place.

Second, the task of labeling data to create “guardrails” that are intended to prevent an AI system’s most toxic output from seeping out is repetitive and often traumatic labor carried out by gig workers and contractors, people locked in a global race to the bottom for pay and working conditions.

Finally, employers are looking to cut costs by leveraging automation, laying off people from previously stable jobs and then hiring them back as lower-paid workers to correct the output of the automated systems. This can be seen most clearly in the current actors’ and writers’ strikes in Hollywood, where grotesquely overpaid moguls scheme to buy eternal rights to use AI replacements of actors for the price of a day’s work and, on a gig basis, hire writers piecemeal to revise the incoherent scripts churned out by AI.

AI-related policy must be science-driven and built on relevant research, but too many AI publications come from corporate labs or from academic groups that receive disproportionate industry funding. Much is junk science—it is nonreproducible, hides behind trade secrecy, is full of hype and uses evaluation methods that lack construct validity (the property that a test measures what it purports to measure).

Some recent remarkable examples include a 155-page preprint paper entitled “Sparks of Artificial General Intelligence: Early Experiments with GPT-4” from Microsoft Research—which purports to find “intelligence” in the output of GPT-4, one of OpenAI’s text synthesis machines—and OpenAI’s own technical reports on GPT-4—which claim, among other things, that OpenAI systems have the ability to solve new problems that are not found in their training data.

No one can test these claims, however, because OpenAI refuses to provide access to, or even a description of, those data. Meanwhile “AI doomers,” who try to focus the world’s attention on the fantasy of all-powerful machines possibly going rogue and destroying all of humanity, cite this junk rather than research on the actual harms companies are perpetrating in the real world in the name of creating AI.

We urge policymakers to instead draw on solid scholarship that investigates the harms and risks of AI—and the harms caused by delegating authority to automated systems, which include the unregulated accumulation of data and computing power, climate costs of model training and inference, damage to the welfare state and the disempowerment of the poor, as well as the intensification of policing against Black and Indigenous families. Solid research in this domain—including social science and theory building—and solid policy based on that research will keep the focus on the people hurt by this technology.