Is the Internet going to be an ocean of drinking water, contaminated with poison, making it useless?
Lies and fakes are all over the Internet. They tear society apart and are used to manipulate economic and political life. People are losing faith in content published on the Internet. The “Liar’s Dividend”, the ability to dismiss genuine evidence as fake, gives the crooked an escape route. Peace is at stake.
How can we set things right? How should we utilize technology to stop this menace in society?
Nature of Internet Lies and Fakes
We can broadly categorize the Internet Lies and Fakes into two areas:
- Fake News, considering it text-based
- Fake Images/Audios/Videos
Nowadays, AI-synthesized Fake News and Fake Images/Audios/Videos are very easy to generate. AI-generated fake content is called “DeepFake”, a name coined by combining “Deep Learning” and “Fake”. Deep learning models such as Autoencoders or Generative Adversarial Networks (GAN) can synthesize fake content automatically. DeepFake technologies require large datasets of images and videos showing facial expressions and movements in order to synthesize new images and videos with similar expressions and movements. (Please refer to the “Generative Adversarial Networks” section in Footnote “Overview of Technologies impacting DeepFake Research” for a basic understanding of how the DeepFake technology works.)
Impact of DeepFake on Society
With negative effect on society:
- Social Media Fake Videos/Images to influence public opinion
- Negative impact on creativity that inspires the development of an original artifact
With positive effect on society:
- Completing a movie with an actor or actress who passed away
- Celebrity advertisement videos without the celebrity always needing to be present in person
- Gaming industry with a synthesized image
- Prediction of climate change
- Prediction of pathogen mutation
Creation of Fake News
Fake News is created mostly for propaganda and for influencing public opinion with different sorts of bias. Thanks to Social Media, Fake News reaches the farthest parts of the world. Influence efforts can be domestic or foreign, shaping people’s opinions across national borders.
Technologies like Natural Language Generation (NLG) can automatically generate Fake News with a specific bias. (Please refer to the “Natural Language Generation” section in Footnote “Overview of Technologies impacting DeepFake Research” for a basic understanding of how NLG works.)
Social bots are also used for the automatic spread of misinformation.
Fake News is a real danger to politics, the media industry, and national security, and it disrupts society with skewed opinions generated through domestic or foreign influence.
Detection of Fake News
Detection of Fake News has been an active research area for quite some time. Approaches in use include automated validation of news against trusted sources and consistency checks between the text and its accompanying images. Critical thinking by citizens is also very important: news should be checked before it is shared with anyone else.
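As a rough illustration of validating a claim against trusted sources, the sketch below scores a claim against a small list of headlines using word overlap (Jaccard similarity). The headline list, the function names, and the choice of Jaccard similarity are assumptions made for illustration; real fact-checking systems use far richer language models and human review.

```python
import re

def tokens(text):
    """Lowercase word tokens, ignoring very short words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 2}

def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def best_match(claim, trusted_headlines):
    """Return (score, headline) for the trusted headline closest to the claim."""
    claim_tokens = tokens(claim)
    scored = [(jaccard(claim_tokens, tokens(h)), h) for h in trusted_headlines]
    return max(scored) if scored else (0.0, None)

# Hypothetical headlines, invented for this example only.
TRUSTED = [
    "City council approves new budget for public schools",
    "Health ministry reports decline in seasonal flu cases",
]

score, headline = best_match("Council approves budget for city public schools", TRUSTED)
print(round(score, 3), "|", headline)
```

A low score only means that no corroborating headline was found in the list. It does not prove that the claim is false.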
Table: Fake News Detection Tools/Websites
|Tools|URL|Nature of Service|
|Hoaxy||Database for Visualization/Tracking|
|Fact-Checker||Media Report Verification|
|Politifact||Independent News Verification|
|FactCheck||Independent News Verification|
|Snopes.com||Independent News Verification|
|RumorLens||Database for Visualization/Tracking|
|TwitterLens||Database for Visualization/Tracking|
|CREDBANK||Database for Visualization|
Creation of Fake Images/Videos
In earlier years, photo editing software such as Photoshop was used for image editing, and simple photo tampering of that kind was easy to detect. Nowadays, tools that automatically synthesize Fake Images/Videos have become freely available. These tools use Deep Learning models like Autoencoders or Generative Adversarial Networks (GAN) to synthesize fake content automatically.
Table: DeepFake Creation Tools/Apps
DeepFake Content Creation Tools/Websites
Detection of Fake Images/Videos
Digital image forensic techniques have been used for investigation purposes for quite some time. These techniques mostly comprise:
- Reverse Image Search
- Image Metadata Analysis
- Camera exposure parameters of Image
- Use of Flash
- Analysis of Eyes and Ears in Images
- Geometric properties of photos
- JPEG Signatures and JPEG Ghosts
- Analysis of Cloning of Images
- Facial Reconstruction – 3D face
- 3D Scene Reconstruction
- JPEG Double Compression
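To give a feel for one item in the list above, “Analysis of Cloning of Images”, the following sketch looks for identical pixel blocks that appear in more than one place in a tiny grayscale grid. The grid, the block size, and the function name are assumptions for illustration. Practical copy-move detectors work on real image data and use features that survive compression and resizing, which exact matching does not.

```python
from collections import defaultdict

def find_cloned_blocks(pixels, block=2):
    """Group top-left (row, col) positions whose block x block contents are identical."""
    rows, cols = len(pixels), len(pixels[0])
    seen = defaultdict(list)
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            key = tuple(tuple(pixels[r + i][c:c + block]) for i in range(block))
            seen[key].append((r, c))
    # Keep only block contents that occur at more than one position.
    return [locs for locs in seen.values() if len(locs) > 1]

# Toy 4x6 "image" in which every pixel value is unique.
image = [[r * 10 + c for c in range(6)] for r in range(4)]

# Simulate tampering: paste the 2x2 patch from the top-left corner at (2, 3).
for i in range(2):
    for j in range(2):
        image[2 + i][3 + j] = image[i][j]

print(find_cloned_blocks(image))
```

On this toy grid the only repeated block is the pasted patch, so the function reports the source and destination positions as one group.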
In September 2020, Microsoft came out with their Solution to Disinformation. Details are available in Microsoft blogs.
Blockchain is a promising technology for providing Proof of Authenticity, for example through Ethereum-based Smart Contracts, and it could be effective in combating lies and fakes on the Internet.
(Please refer to the “Blockchain” section in Footnote “Overview of Technologies impacting DeepFake Research” for a basic understanding of how Blockchain works.)
Examples of companies working on Image/Video authenticity using Blockchain are as follows:
Table: DeepFake Detection Checks/Processes
|Detection Mechanism||Type of Deepfake|
|Consistency of Eye Blinking across frames with time||Video|
|Consistency of Image Parameters across Frames with Time||Video|
|Consistency of Image Parameters within Frame and with Time||Video|
|Consistency of warped face areas with the surrounding context||Video|
|Consistency of head poses||Video/Image|
|Consistency of Eye, Teeth, and face areas/textures||Video|
|Analysis of Photo Response Non-Uniformity (PRNU) of video capturing sensors||Video|
|Consistency of mouth shapes/lip syncs during pronunciation of spoken words||Video|
|Consistency of emotion influencing the audio and video||Video|
|Consistency of Appearance and Behavior considering face expression and head movements||Video|
|Using Blockchain to prove authenticity||Video/Image|
|Using different models of Generative Adversarial Networks||Image|
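The eye-blinking check in the table can be sketched as follows. The code assumes that some upstream face-landmark step has already produced one “eye openness” score per video frame. That score, the 0.2 threshold, and the function names are hypothetical and chosen only for illustration.

```python
def count_blinks(openness, closed_below=0.2):
    """Count closed-eye episodes in a per-frame eye-openness series."""
    blinks = 0
    eye_closed = False
    for value in openness:
        if value < closed_below and not eye_closed:
            blinks += 1          # a new closed-eye episode starts here
            eye_closed = True
        elif value >= closed_below:
            eye_closed = False   # eye reopened, ready for the next blink
    return blinks

def blinks_per_minute(openness, fps, closed_below=0.2):
    """Blink rate for a clip recorded at `fps` frames per second."""
    minutes = len(openness) / fps / 60
    return count_blinks(openness, closed_below) / minutes if minutes else 0.0

# Hypothetical scores: open eyes around 0.3, two short dips below the threshold.
series = [0.3] * 10 + [0.1] * 3 + [0.3] * 10 + [0.05] * 2 + [0.3] * 5
print(count_blinks(series))
```

A clip whose blink rate is far below what is normal for a person on camera could then be flagged for closer inspection. Any such cut-off would need to be calibrated on real footage.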
Creation of Fake Audios
AI-based technologies are also being used to synthesize fake audio content. Coupled with audio DeepFakes, DeepFake videos are becoming highly realistic.
Detection of Fake Audios
AI-synthesized audio is highly realistic, and its quality keeps improving with the introduction of technologies like Generative Adversarial Networks (GAN). Audio forensic research has started leveraging high-frequency spectral analysis of magnitude and phase to distinguish human speech from synthesized speech.
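A minimal sketch of the kind of spectral feature such an analysis might start from is shown below: the share of a signal’s energy that lies above a cutoff frequency. It uses magnitude only and ignores phase. The naive DFT, the test tones, the cutoff, and the function names are all assumptions for illustration. This is a feature extractor, not a detector of synthesized speech.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short illustrative signals)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples))
        mags.append(abs(total))
    return mags

def high_frequency_ratio(samples, sample_rate, cutoff_hz):
    """Share of spectral energy at or above cutoff_hz (0.0 to 1.0)."""
    n = len(samples)
    energy = [m * m for m in magnitude_spectrum(samples)]
    total = sum(energy)
    if total == 0:
        return 0.0
    high = sum(e for k, e in enumerate(energy) if k * sample_rate / n >= cutoff_hz)
    return high / total

def tone(freq_hz, sample_rate=8000, n=256):
    """Pure sine tone, used here only as a stand-in for real audio."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]

low = high_frequency_ratio(tone(250), sample_rate=8000, cutoff_hz=2000)
high = high_frequency_ratio(tone(3000), sample_rate=8000, cutoff_hz=2000)
print(low < high)
```

In practice such features would be computed on short windows of real recordings and fed to a trained classifier rather than compared against a fixed threshold.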
Next Courses of Actions
The first step is building citizens’ awareness of DeepFake content on the Internet. A solid legal structure around the publication of harmful content on the Internet, one that holds the source accountable, can begin the journey. Other measures could be:
- More initiative and funding via DARPA’s MediFor program
- Initiatives around Blockchain-based authenticity
- Social Media Governance through Policies
- Generation and sharing of datasets of all sorts of fake contents for further research
As a general guideline, detecting and blocking fake content at its point of origin on Social Media platforms could be more effective in preventing its spread and the consequent impact on society.
Footnote: Overview of Technologies impacting DeepFake Research
Blockchain
Blockchain is a technology that leverages the Internet to create a distributed system in which transactions can only be appended; they cannot be deleted or modified. It is cryptographically secure and supports data integrity.
Once the fingerprint of an image has been recorded on a blockchain, fraudsters cannot alter that record or pass off a modified copy as the original. Conversely, any video or image without a traceable creator can be treated as suspect.
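The append-only idea can be illustrated with a small hash chain kept in memory. This is only a sketch under simplifying assumptions: it runs in a single process, has no network, consensus, or Smart Contracts, and the `Ledger` class and its method names are invented for this example.

```python
import hashlib
import json

class Ledger:
    """Append-only list of records, each linked to the previous one by its hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self.blocks = []

    @staticmethod
    def _hash(record):
        body = {k: record[k] for k in ("creator", "content_hash", "prev_hash")}
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def register(self, creator, content):
        """Record the SHA-256 fingerprint of `content` (bytes) under `creator`."""
        prev_hash = self.blocks[-1]["block_hash"] if self.blocks else self.GENESIS
        record = {
            "creator": creator,
            "content_hash": hashlib.sha256(content).hexdigest(),
            "prev_hash": prev_hash,
        }
        record["block_hash"] = self._hash(record)
        self.blocks.append(record)
        return record

    def is_intact(self):
        """True if no stored record has been altered or reordered."""
        prev = self.GENESIS
        for block in self.blocks:
            if block["prev_hash"] != prev or block["block_hash"] != self._hash(block):
                return False
            prev = block["block_hash"]
        return True

    def creator_of(self, content):
        """Return the registered creator of `content`, or None if it is unknown."""
        digest = hashlib.sha256(content).hexdigest()
        for block in self.blocks:
            if block["content_hash"] == digest:
                return block["creator"]
        return None

ledger = Ledger()
ledger.register("alice", b"original image bytes")
print(ledger.creator_of(b"original image bytes"), ledger.creator_of(b"edited image bytes"))
```

Changing even one byte of the content changes its fingerprint, so an edited copy no longer matches any registered record, and editing a stored record breaks the chain check.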
Generative Adversarial Networks
Generative Adversarial Networks (GAN) are a fairly recent invention (2014) in Artificial Intelligence that is being used for creating DeepFakes. Interestingly, a number of DeepFake detection tools are also based on GAN technology.
GAN is a Deep Learning algorithm that uses two competing neural networks simultaneously: a Generator and a Discriminator. The Generator is trained to generate fake data, while the Discriminator is trained to spot the fake data generated by the Generator. Because the two networks compete with one another, the performance of both improves. Eventually, training stops when the two competing networks reach an equilibrium and neither improves further.
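The competition described above is usually written as a two-player minimax game. The formulation below is the standard one from the original 2014 GAN paper. The symbols are not used elsewhere in this article: $G$ is the Generator, $D$ is the Discriminator, $x$ is a real sample, and $z$ is the random noise fed to the Generator.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```

The Discriminator tries to maximize $V$ by scoring real samples close to 1 and generated samples close to 0, while the Generator tries to minimize it. At the theoretical equilibrium the generated distribution matches the real one and the Discriminator can do no better than $D(x) = 1/2$ for every input.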
Natural Language Generation
Natural Language Generation (NLG) is an area of Artificial Intelligence that generates output humans can understand. NLG systems take text or data as input and generate text or speech, multimedia presentations, or the automated replies of conversational bots.
Natural Language Generation technologies leverage Deterministic, Probabilistic, or Deep Learning models to generate this output. More importantly, these models can be tweaked to inject bias, cultural nuances, and so on, which influences the news or reports that NLG systems generate automatically.
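A very small probabilistic generator shows how bias in the training text carries over to the output. The sketch below trains a bigram (Markov chain) model. The two-sentence corpus and the function names are invented for illustration, and production NLG systems are far more sophisticated.

```python
import random
from collections import defaultdict

def train_bigrams(corpus):
    """Map each word to the list of words that follow it in the corpus."""
    model = defaultdict(list)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            model[current].append(following)
    return model

def generate(model, start, max_words=8, seed=0):
    """Random walk through the bigram model, beginning at `start`."""
    rng = random.Random(seed)
    words = [start]
    # Stop at the length limit or when the last word has no known successor.
    while len(words) < max_words and model.get(words[-1]):
        words.append(rng.choice(model[words[-1]]))
    return " ".join(words)

# Hypothetical one-sided corpus: every sentence about "the policy" is negative.
biased_corpus = [
    "the policy is a disaster",
    "the policy is a failure",
]
model = train_bigrams(biased_corpus)
print(generate(model, "the", max_words=5))
```

Because the model has only ever seen negative endings, every sentence it produces about the policy is negative as well. Skewing the training text is therefore enough to skew the generated news.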
Fake Photos – Hany Farid