We Can Solve the Problem of Deepfakes and Disinformation

It’s Not About Dismantling the Lie—
But How We Can Process the Surrounding Truth to Expose It

One of the most insidious problems facing humanity is the power of synthetic media to erode our understanding of what is real.  Currently, society’s programmed response to contrived images, videos, audio tracks and documents is to wage a battle of technology over the authenticity of disputed content.  But ultimately this is a losing proposition.  When alleged deepfakes or other disinformation emerge online, laborious efforts to unwind and explain the underlying artifice fail to make victims whole or promote restraint while the veracity of the content is adjudicated.

The key to eradicating the destructive power of deepfakes is to stop fixating on the fake itself and to shift the focus to all the surrounding information that frames and contextualizes every discrete data point in cyberspace.  Leveraging the principle that reality will always have a greater footprint in cyberspace than deepfakes, we can systematically defeat fake media, and advance humanity’s understanding of reality, by developing “truth analytics” that expose and explain the probabilistic relationships connecting every event and condition documented online.

Introduction

It is early November in 2022.  Tomorrow, voters in West Lafayette, Indiana—proud home of Purdue University and its beloved Boilermakers—will elect their next mayor.  Recent polls signal that Samuel Flatbush, a physics professor with no political experience, maintains a double-digit lead over incumbent Alan Burtsfield, who spearheaded a controversial initiative to change the mascot at the local high school from the “Red Devils” to the “River Rats.” 

At 7:30 p.m. on the eve of the election, the Purdue men’s basketball team, ranked fifth in the nation, hosts its in-state rival and archnemesis, the third-ranked Hoosiers from Indiana University.  At 6:45 p.m. a Twitter account with the handle “JoeBoiler” uses the hashtag #Mayor_Judas to tweet a picture of Samuel Flatbush clad head to toe in red and white Hoosier regalia, with “Purdue Sucks” beer coasters prominently displayed in each hand.  The local CBS affiliate is alerted to the tweet and leads its 7:00 p.m. news broadcast with the story.

The real Samuel Flatbush—ensconced in his living room, wearing black and gold “Boiler Up” flannel pajamas and gripping only a mug of hot chocolate and the TV remote—watches helplessly as the evening news anchors predict his defeat.

At 7:04 p.m. Flatbush plunges into a deep depression.

At a local pub across town, Samuel’s best friend and campaign manager is also watching the lead story unfold on the evening news.  At 7:05 p.m. she calmly pulls out her cell phone, opens an “Event Verification” app called SilentProveE, and drags the tweeted image of Flatbush into SilentProveE’s user interface.  Immediately the app begins compiling information from the Internet while executing millions of queries across an expansive catalogue of government and commercial databases.  Every step of this process is directed by an open source machine learning algorithm—nicknamed “TruthFinder”—that has been vetted and refined through thousands of peer-reviewed trials.

SilentProveE’s job is to piece together and evaluate the mosaic of available information tending to prove or disprove the authenticity of the photograph—then assign the picture a “likelihood score” ranging from 0 (no way) to 100 (no doubt).  In less than two seconds SilentProveE assembles billions of pieces of relevant information.  TruthFinder ascertains that ten data points are particularly probative:

  • A political candidate is contesting the authenticity of a personal image posted less than twenty-four hours before an election.
  • No other image of Samuel Flatbush wearing Indiana University paraphernalia was retrieved.
  • The disputed photograph was taken in front of a service elevator on the second floor of Purdue University’s physics department.
  • The disputed photograph is a 73% match to a photograph of Flatbush (wearing a tan corduroy suit) published in a 2016 newsletter entitled “Relatively Exciting Events in Quantum Relativity.”
  • Security footage from Purdue’s physics department captured 32,458 images of students and professors wearing Boilermakers attire in the previous six months, compared with 243 images of a person wearing Hoosiers attire.
  • According to TruthFinder’s facial recognition tool, none of those 243 images was a match to a Purdue professor.
  • Samuel Flatbush is a board member of the West Lafayette Baha’i Congregation.  The Baha’i Faith forbids the consumption of alcohol.
  • Over the last five years no credit card belonging to Samuel Flatbush was used to make a purchase within the city limits of Bloomington, Indiana (the site of Indiana University’s flagship campus) or at an Indiana University satellite location.
  • Neither Flatbush nor any member of his family within two generations attended Indiana University. 
  • The time and date metadata typically embedded in digital photographs was removed from the disputed image.

TruthFinder calculates a likelihood score of “6 out of 100” for the tweeted image of Flatbush. SilentProveE brands this score onto the original Twitter image, links the result to an index of the most conclusive facts, and sends an alert to every user following #Mayor_Judas.  As one of three Event Verification services with an audit-certified accuracy score above 98%, SilentProveE is authorized to exercise these administrator privileges on social media platforms under the Securing the Truth Of Publicly Broadcast Statements (STOP BS) Act of 2021.

SilentProveE publishes its likelihood score at 7:07 p.m.  At 7:15 p.m. the evening news anchors update their lead story to announce that SilentProveE believes the Twitter image is a fake.

At 7:17 p.m. Samuel lifts his face off the couch cushion.

At 7:30 p.m. the Burtsfield campaign issues a formal apology to Flatbush, attributing the tweet to the misguided actions of a rogue campaign staffer.

The next morning Election Day mercifully arrives, and Samuel Flatbush is elected mayor of West Lafayette with sixty-two percent of the vote.

Deepfakes as a Psychological WMD

Malign actors can use artificial intelligence (AI) to manufacture synthetic images, videos, audio tracks or documents depicting real people engaging in provocative acts or saying inflammatory things that never happened.  These “deepfakes,” as they are popularly known, are so convincing and difficult to detect that they are easily mistaken for reality, with consequences that can range from humiliation to catastrophe.  The Washington Post’s editorial board has warned that “[t]he ability to use machine learning to simulate an individual saying or doing almost anything poses personal and political risks that societies around the world are ill-equipped to guard against.”  Ben Sasse, a Republican Senator from Nebraska, predicts that deepfakes “are likely to send American politics into a tailspin,” citing the widely held view that America’s enemies regard deepfakes as a weapon that can wreak tectonic damage if detonated along the proper social and political fault lines.

Spreading misinformation to confuse or weaken an adversary is a time-honored tactic in warfare, and a sharp but familiar gambit in politics, business, sports and other domains where people compete for high stakes.  But deepfakes pose dangers that transcend the destructive impacts of a misinformation campaign.  Coming to terms with this threat requires a more nuanced understanding of the distinction between a Potemkin village and Scarlett Johansson’s face engrafted onto a body performing pornographic acts. 

Conventional misinformation is derivative, ephemeral and transactional.  Misinformation is derivative in the sense that it deceives by obscuring or misrepresenting a truth that nonetheless remains ascertainable.  Acme Corporation may insist that its widget production process eliminates environmental contaminants.  But if this assertion is false, tests of the surrounding air and water will reveal the lie through evidence that third parties will be inclined to credit.  Misinformation is also ephemeral—the circulation of false information creates a class of victims motivated to reveal the truth, such that the lie is increasingly difficult to sustain over time.  Knowing these lies are perishable, purveyors of misinformation typically seek to secure some permanent advantage over a short time horizon.  Finally, misinformation campaigns are transactional; they aim to deceive a select audience long enough to achieve a specified objective.  The deception is most potent when it reaffirms the integrity of other genuine information the victim perceives, so as to harmonize—rather than jumble—the inputs that comprise the target’s reality. 

These qualities of misinformation limit the damage that misinformation inflicts beyond the targeted audience.  But deepfakes are not similarly constrained.  A deepfake does not distort an organic truth that remains discoverable—it originates a unique fact with no traceable lineage to a “real” condition or event.  If a deepfake video surfaces of Chief Justice John Roberts using a racial slur to describe black people, no collateral fact could be established that would exonerate Mr. Roberts.  The only recourse is to unmask the artifice that generated the deepfake, but as synthetic media become more sophisticated these efforts may well be futile, or too slow and inconclusive to make the victim whole.  Hence the fraud perpetrated by a deepfake is not ephemeral because it defies resolution, leaving uncertainty in its wake. 

Deepfakes also sow instability that is not localized or transactional, but diffused and cumulative.  Unlike misinformation, fake content contrived through AI does not reinforce surrounding realities or inexorably give way to truth.  Quite the opposite: deepfakes call into question the reliability of any fact encountered in cyberspace not verified first-hand or through a trusted source.  The social contract that enables our way of life presupposes that people experience a common reality that baselines human activity and decision-making.  Deepfakes erode this foundational support, and with it our capacity to sustain a civil society.   

The Futility of Deepfake Whack-A-Mole

The antidote to deepfakes and other AI-enabled disinformation is elusive.  Our programmed response to fake content is to wage a battle of technology over the authenticity of a disputed image, video, audio recording, or statement.  But this approach is not the only available recourse, and it arguably offers the least advantageous terrain for combating deepfakes.  Fighting a deepfake by laboring to detect and expose its underlying technology cedes to the malign actor the power to dictate the terms of engagement, with unsettling implications. 

Experts may dispute whether “white hats” or “black hats” are currently winning the arms race to alternately develop and expose new species of disinformation, but everyone agrees that the technology driving synthetic media is rapidly evolving.  One commonly cited example is the Generative Adversarial Network (GAN), a system in which one network (the “generator”) makes iterative attempts to convince a second network (the “discriminator”) that a synthetically generated image is real.  Over numerous trials the generator “learns” to manufacture the desired output with increased fidelity as the discriminator rejects the generator’s imperfect submissions and highlights the detected flaws.  Ultimately the generator will submit an output with no discernible defects, and the discriminator will ratify the result.
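
The generator-and-discriminator loop described above can be sketched in a few lines.  This is a minimal toy under loud assumptions, not a production GAN: the “images” are single numbers, the discriminator is a one-variable logistic model, the generator has a single parameter, and the noise is a fixed list so the run is repeatable.  Every name here (train_toy_gan, theta, w, b) is hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def mean(values):
    return sum(values) / len(values)


REAL_SAMPLES = [3.5, 4.0, 4.5]   # stand-ins for genuine content (assumed data)
NOISE = [-0.5, 0.0, 0.5]         # fixed noise keeps this toy deterministic


def train_toy_gan(steps: int = 15, lr: float = 0.1):
    """Alternate discriminator and generator updates for a few steps."""
    w, b = 0.0, 0.0    # discriminator: D(x) = sigmoid(w * x + b)
    theta = 0.0        # generator: G(z) = theta + z
    for _ in range(steps):
        fake = [theta + z for z in NOISE]
        d_real = [sigmoid(w * x + b) for x in REAL_SAMPLES]
        d_fake = [sigmoid(w * x + b) for x in fake]

        # Discriminator step: gradient ascent on log D(real) + log(1 - D(fake)).
        grad_w = (mean([(1 - d) * x for d, x in zip(d_real, REAL_SAMPLES)])
                  - mean([d * x for d, x in zip(d_fake, fake)]))
        grad_b = mean([1 - d for d in d_real]) - mean(d_fake)
        w += lr * grad_w
        b += lr * grad_b

        # Generator step: gradient ascent on log D(fake), i.e. try to
        # make the updated discriminator score the fakes as real.
        d_fake = [sigmoid(w * x + b) for x in fake]
        theta += lr * mean([(1 - d) * w for d in d_fake])
    return theta, w, b


if __name__ == "__main__":
    theta, w, b = train_toy_gan()
    print("generator offset:", theta, "discriminator weight:", w)
```

The run is kept short on purpose: the discriminator’s feedback pulls the generator’s output toward the real samples, but even a toy like this can overshoot and oscillate if trained for long, which is one reason real GAN training is delicate.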

Autoencoder neural networks are a second technology used to generate deepfakes.  An autoencoder learns the essential features of a human face by performing a “compression task” in which the autoencoder receives an initial image, compresses the image into a reduced-data representation of the input, then reconstructs the compressed image into an output that (to the human eye) is indistinguishable from the original image.  Once an autoencoder can discern the patterns in compressed data that identify the original input of a human face, it can apply that knowledge to process multiple decompressed facial images into a new artificial image that conforms to the specifications for a real human face, and therefore simulates an actual person.  With nuanced manipulations and sufficient experimentation, autoencoders can be calibrated to generate deepfakes with precise specifications. 
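
The “compression task” can be illustrated with a deliberately tiny example.  The sketch below assumes a linear autoencoder with a single shared weight vector that squeezes two numbers into one and then reconstructs them; real deepfake autoencoders are deep neural networks operating on images, and the data, names and learning rate here are hypothetical.

```python
# Toy data: 2-D points that all lie on one line, so a single number
# per point is enough to describe them (assumed, illustrative data).
POINTS = [(t * 0.6, t * 0.8) for t in (-1.0, -0.5, 0.5, 1.0)]


def reconstruct(u, x):
    code = u[0] * x[0] + u[1] * x[1]     # compress: two numbers become one
    return (code * u[0], code * u[1])    # decompress: one number becomes two


def reconstruction_error(u, points):
    """Mean squared distance between each point and its reconstruction."""
    total = 0.0
    for x in points:
        r = reconstruct(u, x)
        total += (r[0] - x[0]) ** 2 + (r[1] - x[1]) ** 2
    return total / len(points)


def train_autoencoder(points, steps: int = 200, lr: float = 0.1):
    """Gradient descent on the reconstruction error of a tied-weight autoencoder."""
    u = [1.0, 0.0]   # arbitrary starting weights
    for _ in range(steps):
        grad = [0.0, 0.0]
        for x in points:
            code = u[0] * x[0] + u[1] * x[1]
            err = (code * u[0] - x[0], code * u[1] - x[1])
            err_dot_u = err[0] * u[0] + err[1] * u[1]
            for k in range(2):
                # Derivative of the squared error with respect to u[k].
                grad[k] += 2.0 * (err_dot_u * x[k] + code * err[k])
        u = [u[k] - lr * grad[k] / len(points) for k in range(2)]
    return u


if __name__ == "__main__":
    weights = train_autoencoder(POINTS)
    print("learned weights:", weights)
    print("reconstruction error:", reconstruction_error(weights, POINTS))
```

Once the weights capture the pattern in the data, the one-number code is enough to rebuild each point, which is the small-scale analogue of reconstructing a face from its compressed representation.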

Other AI initiatives leverage machine processing to decode and replicate linguistic signatures—e.g., grammar, vocabulary, tone, style, sentence structure and narrative voice—that uniquely identify a public figure’s writing style or manner of speech.  Once perfected, these algorithms could generate copycat papers, correspondence or statements that falsely impugn the reputation and credibility of the presumed author.    

GANs, autoencoders, and machine networks that mimic patterns in human writing herald a wave of new technologies that are certain to outpace—at least temporarily—competing initiatives to reveal the telltale manipulations that identify a deepfake.  Indeed, it seems foolhardy to assume that innovators can provide a persistent capability to outperform the discriminator in a GAN, detect the synthetic neural processing of an autoencoder, and unmask AI-infused technology trained to mimic a target’s writing style.  The more reasonable assumption is that protagonists will not continually enjoy the upper hand in this struggle.  Instead of resisting this reality we should embrace it, and pursue a global solution to deepfakes that society can implement even as it plays from behind in the technology arms race.              

A means of exposing deepfakes that bypasses the underlying AI offers two significant advantages.  First, it would limit the need to rely on social media companies and other content hosts to police fake media on their own platforms.  By default we anoint the likes of Facebook, Twitter, Instagram, Gfycat, Pornhub and others as the arbiters of reality in their own cyberspace, with unsatisfying results.  In some cases these companies possess but do not disclose metadata and other proprietary information about the origins of challenged content on their platforms.  Certain platforms deploy tools to identify false content that perform poorly, lack transparency, and expose the innate conflicts of interest that make content hosts unfit to be the exclusive judge of authenticity on their platforms.  As long as our response to deepfakes is contingent on unraveling the underlying technology, we will continue by necessity to vest content hosts with unwarranted autonomy to act as our true eyes and ears on the Internet.

Looking beyond technology for a strategy to unmask deepfakes advances a second critical objective by enabling judgments on disputed content that can be communicated quickly, explained simply, and digested easily.  Whether an image, video, audio clip or document is ultimately deemed fake or genuine is oftentimes beside the point; deepfakes are dangerous because of their tendency to provoke extreme reactions before this verdict is delivered.  The ideal countermeasure is a tool that can render cogent, accurate and decisive verdicts on disputed content quickly enough to institutionalize a “cooling-off period” when revelatory content emerges online. 

The speed of these rulings is important, but the nature of the rationale is paramount.  Adjudicating disputed content through jargon-packed expositions on the presence or absence of an AI manipulation misses the mark; these explanations resist clarity and invite rejoinders that further muddy the waters.  The most potent weapon against deepfakes is the capability to resolve content disputes through clear verdicts that manifest the same heuristics people apply in their everyday lives to differentiate truth from lies.  We typically reconcile conflicting information by relying on first-hand knowledge, weighing direct and circumstantial evidence, assessing the credibility of sources and subjects, applying principles of logic and probability, and leveraging personal experiences to predict outcomes and analyze cause and effect.  Deepfake determinations reasoned from the same type of narrative facts that inform human judgment stand the best chance of resonating with lay audiences and promoting restraint while the veracity of challenged content is adjudicated.

These facts exist.  There are quadrillions of them in the public domain that surround, contextualize and frame every discrete piece of content in cyberspace.  The key to eradicating the destructive power of deepfakes is to stop fixating on the fake itself—and begin shifting the focus to everything else.      

The Reality Power Law

The actors who propagate deepfakes enjoy almost every conceivable advantage in cyberspace.  The software used to create deepfakes is inexpensive or free, widely available, and embedded in easy-to-use applications that require little technical expertise.  The Internet is replete with tools that cloak users in anonymity and foil attempts to attribute content to the originating user or machine.  Even in the rare case where false or destructive content can be traced to its source, there is no existing or prospective body of law that governments can enforce outside their borders to hold perpetrators accountable.

But in the midst of this darkness there is a ray of light—an immutable, insurmountable rule of the virtual world that can be operationalized to systematically defeat deepfakes and disinformation.  Simply stated, reality will always have an exponentially greater footprint in cyberspace than deepfakes.  In mathematical terms this relationship can be expressed as a power law.  Just as doubling the sides of a square will always increase its area by a factor of four, the technical architecture and cultural norms that dictate how the human experience is projected onto the Internet ensure that data points reflecting actual conditions and events will always outnumber by an order of magnitude the quantity of content fabricated by mischievous or nefarious actors to masquerade as reality.  In the same way that gravity preserves order in the physical world, we can harness the “Reality Power Law” to filter alleged facts through a predominantly truthful sieve of electronic data as a means of safeguarding the integrity of cyberspace.
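
As a worked illustration of the square analogy, a power law can be written in the generic form below.  The symbols y, x, c, k and s are introduced here for illustration only; the essay does not specify an exponent for the Reality Power Law.

```latex
y = c\,x^{k}
\qquad\text{and, for a square of side } s:\qquad
A(s) = s^{2} \;\Longrightarrow\; A(2s) = (2s)^{2} = 4\,s^{2} = 4\,A(s)
```

In a power law, multiplying x by a factor multiplies y by that factor raised to the power k.  The square is the case k = 2, so doubling the side multiplies the area by 2 squared, or four.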

In the war against deepfakes, the Reality Power Law reverses the tilt of the battlefield to give white hats the high ground.  Instead of waging an unending uphill fight to unmask the virtual traces of AI in synthetic media, we can redirect our efforts toward a more achievable and socially worthwhile endeavor—processing the wealth of information in cyberspace to map the probabilistic connections among all the diverse data points in the public domain.  Suppose, for example, an algorithm reviewing trillions of data points determines that women between the ages of twenty-seven and twenty-nine who have a high school degree and were raised by Muslim parents in rural and relatively cool climate zones are disproportionately unlikely to engage in pornography or commit violent crimes.  If Amara falls within this cohort, and a new video depicts her participating in pornographic activities or robbing a bank, these data points provide important insight into the credibility of the film.  The algorithm may also discover that Amara’s father is campaigning for a high-profile political position, or uncover images of Amara backpacking in the Swiss Alps immediately before and after the film was created.  At the end of this exercise the veracity of the film can be evaluated, estimated, articulated and understood without reference to the technology used to generate the video.  And if the video is later acknowledged or proven to be fake, this event verification process may prompt social scientists to investigate whether and why being raised by Muslim parents, living in rural or cool climate zones, and/or graduating from high school are inversely correlated with pornography and crime.

The quest for a “truth analytic” (an algorithm that can process accessible data to predict the veracity of any disputed fact) is not fanciful.  We generally regard the “truth” of a proposition as the innate, internal and organic quality of embodying reality.  But truth leaves a residue, and people have survived and evolved by priming themselves to detect this residue or register its absence.  The world compels us to discern truth to execute the full range of human activity, from distinguishing edible and poisonous plants to developing successful surgical techniques, landing on the moon, and operating criminal justice systems that sustain social confidence.  Operating without the benefit of machines, our ancestors learned to resolve uncertainty by applying cognitive tools like logic, reason, and intuition to a knowledge base fortified by data confirmed through personal experience, scientific verification, universal consensus or credible sources.  This process is imperfect, and history has witnessed a plethora of miscalculating humans inadvertently poisoning themselves, losing patients on the operating table, suffering disasters in space and convicting innocent people.  But humanity’s evolving capacity to expand, document, and share the breadth of its experiential knowledge inarguably advances the search for truth.  It underwrites a more informed approach to ambiguity while distilling those uncertainties that pose irreducible risk to the decision-maker.

Big data processing could not only accelerate our progress along this trajectory but redefine the way we conceive and detect truth.  Algorithms can be trained to apply the same formulas humans use to discern truth: to identify information tethered closely enough to a disputed fact to have direct or circumstantial evidentiary value, weigh the credibility of sources, and apply principles of logic and causality.  But analytics processing massive volumes of data can also peer into knowledge spaces that people do not see, and never seek out, when assessing the veracity of a disputed fact.  It is conceivable, for example, that individuals with particular combinations of characteristics, preferences and experiences empirically exhibit strikingly similar behaviors and make highly comparable choices.  Suppose an image is posted online purporting to depict Andrea defacing old growth trees at an arboretum.  An analytic that could fashion a “behavioral cohort” for Andrea and determine whether other members of this cohort committed similar acts of vandalism would add significant context to the disputed image with data that currently lies beyond the limits of our unaided cognizance.
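
One way to picture a “behavioral cohort” is as a simple filter followed by a rate comparison.  The sketch below is only an illustration under stated assumptions: the records, trait labels and the min_shared threshold are invented, and a real analytic would need far richer data and a more careful definition of similarity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    traits: frozenset       # hypothetical characteristics and preferences
    committed_act: bool     # whether this person is known to have done the act


def rate(people):
    """Fraction of the given people who committed the act, or None if empty."""
    if not people:
        return None
    return sum(p.committed_act for p in people) / len(people)


def behavioral_cohort(people, subject_traits, min_shared: int):
    """People who share at least `min_shared` traits with the subject."""
    return [p for p in people if len(p.traits & subject_traits) >= min_shared]


# Invented records, for illustration only.
PEOPLE = [
    Person(frozenset({"hiker", "botanist", "volunteer"}), False),
    Person(frozenset({"hiker", "botanist"}), False),
    Person(frozenset({"hiker", "volunteer"}), False),
    Person(frozenset({"botanist", "volunteer"}), True),
    Person(frozenset({"gamer"}), True),
    Person(frozenset({"gamer", "hiker"}), True),
]
SUBJECT_TRAITS = frozenset({"hiker", "botanist", "volunteer"})

if __name__ == "__main__":
    cohort = behavioral_cohort(PEOPLE, SUBJECT_TRAITS, min_shared=2)
    print("population rate:", rate(PEOPLE))
    print("cohort rate:", rate(cohort))
```

A cohort rate well below the population rate does not prove a disputed image is fake; it is one piece of context that can be weighed alongside other evidence.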

These cognitive constraints also stifle consideration of “dogs that didn’t bark” when people evaluate disputed facts.  In Sir Arthur Conan Doyle’s mystery “Silver Blaze,” Sherlock Holmes famously deduces that a race horse was stolen by the horse’s trainer, and not a stranger, after establishing that a dog in the horse’s stable did not bark on the night of the horse’s disappearance.  In the search for truth, the predictive value of critical non-events is an unexplored realm that machine algorithms could unlock.  Suppose, for example, an algorithm processing thousands of data points identifies precursors that nearly always precede participation in pro-Nazi demonstrations, or behaviors that typically follow such activity.  If an image of David marching in Nazi regalia is posted online after the President announces David’s nomination for a cabinet position, a predictive analytic may ascribe substantial weight to the absence of credible data capturing these antecedents or after-the-fact indicators.  Just as jurors have become accustomed to expecting corroborating DNA evidence when a defendant is accused of a crime involving a physical altercation with the victim, truth algorithms can condition online audiences to account for the presence or absence of machine-verified precursors or post-hoc indicators when evaluating a disputed event.
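
The weight of a “dog that didn’t bark” can be made concrete with Bayes’ rule.  The sketch below is a simplified illustration: it treats a single precursor in isolation, and the probabilities in the example are invented placeholders rather than measured values.

```python
def posterior_given_absence(prior: float,
                            p_seen_if_true: float,
                            p_seen_if_false: float) -> float:
    """P(event happened | the expected precursor was NOT observed).

    prior            -- belief that the event happened before checking
    p_seen_if_true   -- chance the precursor is observed when the event is real
    p_seen_if_false  -- chance the precursor is observed when it is not
    """
    absent_if_true = 1.0 - p_seen_if_true
    absent_if_false = 1.0 - p_seen_if_false
    numerator = prior * absent_if_true
    denominator = numerator + (1.0 - prior) * absent_if_false
    if denominator == 0.0:
        raise ValueError("absence of the precursor is impossible under both hypotheses")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical numbers: the precursor "nearly always" (95%) accompanies
    # real events and rarely (5%) appears otherwise.
    print(posterior_given_absence(prior=0.5, p_seen_if_true=0.95, p_seen_if_false=0.05))
```

The more reliably a precursor accompanies genuine events, the more its absence lowers the estimate; a precursor that is equally common either way leaves the estimate unchanged.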

In a similar vein, truth algorithms can identify prognostic contra-indicators for acts and behaviors depicted in deepfakes.  Suppose a video is posted purporting to display eight intercontinental ballistic missiles (ICBMs) being launched toward Moscow from a U.S. military base in Stuttgart, Germany.  If publicly available data gathered from local weather sensors indicates there was no spike in the ambient air temperature over Stuttgart at or immediately after the time the missiles were allegedly launched, an algorithm correlating the weather patterns in Stuttgart with the typical fluctuation in air temperature caused by the heat trail from an ascending missile could quickly and credibly discount the authenticity of the video.
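
A contra-indicator check of this kind might be sketched as a comparison between a sensor reading and its recent baseline.  Everything specific below is an assumption made for illustration: the readings are invented, and the threshold of three standard deviations is an arbitrary stand-in for whatever signature a real launch would actually produce.

```python
from statistics import mean, stdev


def spike_z_score(baseline_temps, observed_temp: float) -> float:
    """How many standard deviations the reading sits above the baseline.

    Needs at least two baseline readings that are not all identical.
    """
    return (observed_temp - mean(baseline_temps)) / stdev(baseline_temps)


def shows_expected_spike(baseline_temps, observed_temp: float,
                         min_z: float = 3.0) -> bool:
    """True if the reading is unusually high relative to the baseline."""
    return spike_z_score(baseline_temps, observed_temp) >= min_z


# Invented readings in degrees Celsius, for illustration only.
BASELINE = [12.0, 12.2, 11.9, 12.1, 11.8]

if __name__ == "__main__":
    reading_at_alleged_launch = 12.1
    if not shows_expected_spike(BASELINE, reading_at_alleged_launch):
        print("No temperature spike: the sensor data does not corroborate the video.")
```

An absent spike would feed into the overall assessment as evidence against the video rather than settle the question by itself.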

We have the raw materials we need to begin building a viable event verification capability.  While a truth algorithm could be asked to evaluate innumerable facts, only a narrow subset of events and conditions are sufficiently flagrant or impactful to inspire the majority of deepfake and disinformation attempts.  Beginning with these end-states—pornography and other vices, violent acts, hateful or disparaging speech and behavior, and military movements—we can exploit the vast trove of accessible information in cyberspace to reverse engineer the contextual clues that both indicate and contra-indicate these circumstances. 

The volume and significance of this data cannot be overstated.  Envision an open collaboration among data scientists and social scientists to build the “pornography module” for a truth algorithm.  Researchers may glean vital information from traditional sources such as academic literature, case studies, media reports, profiles of actors and actresses, and interviews with key industry personnel.  But these results would be enriched by an equally important and untapped resource: public data about millions of individuals not linked to pornography.  The analysis of this data may reveal previously obscured indicators and patterns that help define and differentiate groups associated and not associated with pornography.  These insights, in turn, could enable fast and informed predictions about the veracity of new videos and images that depict identifiable individuals engaged in pornographic acts.          

These predictions do not have to be perfect in order to be useful.  In most cases formatting the event verification output as a likelihood score ranging from 0 to 100 will achieve the desired result—conditioning audiences to withhold judgment pending a machine examination that affirms or undermines the authenticity of inflammatory content; and suggesting additional avenues of manual inquiry that could diminish the residual uncertainty.  Ultimately the sheer number of variables relevant to an alleged fact limits the precision of a machine-generated truth assessment.  But this multiplicity of relevant factors also precludes malign actors from fooling an event verification system by reinforcing a deepfake with additional false context.  Assuming it would even be feasible to plant artificial breadcrumbs along the thousands or millions of different trails a truth analytic might explore, the ruse would invariably generate its own trail of probabilistic anomalies that confess the underlying fraud.
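
One simple way to turn many pieces of evidence into a 0-to-100 likelihood score is to add up log-odds.  The sketch below is a hedged illustration, not the method of any actual service: it assumes each data point can be summarized as a likelihood ratio (how much more probable that observation is if the content is genuine than if it is fake) and that the data points are independent, which real evidence rarely is.  The function name and the example ratios are hypothetical.

```python
import math


def likelihood_score(prior: float, likelihood_ratios) -> int:
    """Combine a prior and independent likelihood ratios into a 0-100 score.

    prior              -- starting probability that the content is genuine (0 < prior < 1)
    likelihood_ratios  -- one positive ratio per data point; values below 1
                          count against authenticity, values above 1 count for it
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        if ratio <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += math.log(ratio)
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return round(100 * probability)


if __name__ == "__main__":
    # Hypothetical ratios for three data points that each cast doubt on an image.
    print(likelihood_score(prior=0.5, likelihood_ratios=[0.5, 0.5, 0.25]))
```

Because each data point only nudges the total, no single fact has to be decisive, and the list of ratios doubles as a plain-language index of which facts moved the score and in which direction.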

AInthropology

With existing tools and processes, we can reveal new dimensions of truth configured by the probabilistic relationships that connect every event and condition captured in a data point.  Synthetic media may be the trigger that induces us to explore these frontiers, but there is more at stake in this journey than our capacity to cope with advancements in synthetic media.  Often forgotten in the race to develop new algorithms, analytics, and other machine processes is that AI is not merely a technology.  It is a giant mirror that captures and reflects the full history of the human experience, distilled momentarily on a grand scale only because humanity has chosen this moment in time to aggregate, refine and instill in machines the sum total of its acquired knowledge.  These inputs have a fascinating story to tell about progress and imperfection.  If we preserve these artifacts only as ones and zeroes, we will miss a singular opportunity to better comprehend the hard-wired idiosyncrasies that lead humans to systematically misapprehend the familial bonds that unite the future with the past and present. 

As we continue to work collaboratively with machines to compensate for these shortcomings, we will hopefully retain the sensitivity and humility to apply new insights to the activities and processes that humans deliberately place beyond the reach of AI.  If, for example, truth algorithms universally conclude that men over forty years of age with Attention Deficit Hyperactivity Disorder are exceedingly unlikely (relative to other cohorts) to kidnap a child at gunpoint, the machine’s “testimony” may have evidentiary value to a jury confronted with these circumstances even if the result cannot be rationalized by a lawyer or judge.

Like any machine analytic that provides social utility by scanning information in cyberspace, event verification raises challenging civil rights and civil liberties questions.  Truth algorithms might suggest things that we do not want to hear—perhaps that the quality of being Jewish, or Hispanic, or elderly, or having a particular gene or ancestral lineage positively correlates with an undesirable outcome.  But if, as Albus Dumbledore reassures Harry Potter, our behaviors and decisions are truly the result of our choices and not predetermined by immutable characteristics, a battle-tested algorithm with broad access to electronic data should ratify the primacy of act over attribute.  If it does not, event verification will at least remove our blinders and position society to chart a more informed path toward this ideal.

While truth algorithms also raise thorny privacy concerns, deepfakes unsettle the prevailing orthodoxy that “more is always worse” when it comes to data exposure.  All else equal, a truth analytic is more likely to exonerate an innocent subject with an extensive online presence than a victim who fastidiously preserves his anonymity in cyberspace.  In the aggregate the efficacy of an event verification service increases with enhanced access to relevant data, and vice-versa.  Time will tell whether deepfakes and disinformation pose threats to personal safety, security and reputation that increase society’s tolerance for machine processing of personal data and enlarge the universe of data types and data repositories available to machine algorithms.  If so, it will be incumbent on technologists, civil libertarians and data stewards to establish rules and architectures that preserve a more contemporary notion of privacy by prescribing unique and segregable roles for humans and machines in modern information environments. 

*Jonathan Fischbach is an attorney for the U.S. Department of Defense. The positions expressed in any of his articles do not necessarily reflect the views of any federal agency or the views of the United States government.