Picture this. Your CFO joins a Zoom call with three other executives to approve a wire transfer for a time-sensitive acquisition. Everyone is on camera. Everyone sounds exactly like themselves. The CFO authorizes the transfer based on what appears to be a normal Friday afternoon call. Except one of the people on that call was not real. Their face was generated in real time by an AI model. Their voice was cloned from earnings call recordings publicly available on YouTube. And the $4.3 million wire transfer just landed in an account controlled by attackers.
This is not science fiction. This is happening right now, and it is accelerating fast. Real-time deepfakes during live video calls have moved from a theoretical risk to an active attack vector, and most businesses have zero defenses against it.
What the Threat Landscape Looks Like in 2026
Multiple security vendors, including SentinelOne and CrowdStrike, have flagged live deepfakes as an emerging category of business fraud in their 2025 and 2026 threat reports. Researchers have documented cases where attackers used real-time face-swapping and voice cloning during onboarding calls, financial approval meetings, and vendor verification sessions. The pattern is consistent: attackers target high-value moments where a single person's visual and verbal confirmation is enough to move money or grant access.
The timing makes sense. We are in Q2 financial close season right now, which means wire transfer volumes are elevated across every industry. Accounting teams are processing more payments, approving more invoices, and working under tighter deadlines. Attackers know this. They know that urgency and routine are the two ingredients that make people skip verification steps. A deepfake impersonation during a busy close period does not need to be perfect. It just needs to be good enough to survive a three-minute video call.
How the Technology Actually Works
Understanding the mechanics helps you understand both the capabilities and the limitations. Real-time deepfake attacks rely on three core components working together.
Face-Swapping Models
Modern face-swapping works by training a neural network on images and video of the target person. The model learns the geometry of their face, how their expressions move, how light falls across their features from different angles. During a live call, the attacker sits in front of their own webcam, and the model maps their facial movements onto the target's face in real time. The output is fed into the video call as a virtual camera feed, so the other participants see what looks like the target person talking and reacting naturally.
The training data is often embarrassingly easy to obtain. Corporate headshots, LinkedIn photos, conference talks, webinar recordings, and earnings call videos all provide the raw material these models need. A determined attacker can build a usable face model from as few as 10 to 15 clear photos, though more data produces better results.
Voice Cloning
Voice cloning has made enormous strides in the past 18 months. Current models can produce a convincing voice clone from as little as 30 seconds of clean audio, though two to three minutes of speech produces much more natural results. The cloned voice captures not just tone and pitch but speech patterns, cadence, and even filler words. During a live call, the attacker speaks normally and the model converts their voice into the target's voice with minimal latency.
Source material for voice cloning is even easier to find than face data. Podcast appearances, conference keynotes, investor calls, and company YouTube channels all provide clean audio samples. Some attackers have even used voicemail greetings as their seed audio.
Latency and the "Good Enough" Threshold
The biggest technical challenge for real-time deepfakes is latency. Face-swapping and voice conversion both require processing time, and any noticeable delay between when the attacker speaks or moves and when the output appears on the call creates a tell. Current consumer-grade hardware introduces roughly 100 to 200 milliseconds of additional latency. On a typical Zoom call where network latency already introduces some delay, this is often unnoticeable. Participants attribute any slight lag to normal internet conditions.
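To see why the delay hides so easily, run the arithmetic. The sketch below is illustrative only: the network and jitter figures are assumptions, the processing figure is the midpoint of the range above, and the 400 millisecond ceiling is the one-way delay that telephony guidance (ITU-T G.114) treats as the limit of acceptability.

```python
# Illustrative latency budget for a real-time deepfake on a video call.
# The network and jitter figures are assumptions, not measurements.

NETWORK_ONE_WAY_MS = 80        # assumed one-way network delay on the call
JITTER_BUFFER_MS = 60          # assumed client-side buffering to smooth playback
DEEPFAKE_PROCESSING_MS = 150   # midpoint of the 100-200 ms range cited above

baseline = NETWORK_ONE_WAY_MS + JITTER_BUFFER_MS
with_deepfake = baseline + DEEPFAKE_PROCESSING_MS

# ITU-T G.114 treats roughly 400 ms one-way delay as the ceiling for voice.
CEILING_MS = 400

print(f"Baseline delay:        {baseline} ms")
print(f"With face/voice swap:  {with_deepfake} ms")
print(f"Under the {CEILING_MS} ms ceiling: {with_deepfake < CEILING_MS}")
```

At these assumed figures the swap adds 150 milliseconds to a 140 millisecond baseline and still lands well under the ceiling, which is exactly why participants write the lag off as normal internet conditions.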
The quality bar for a successful attack is lower than most people assume. Attackers do not need to fool a forensic analyst reviewing frame-by-frame recordings. They need to fool a tired finance director during a five-minute call at 4:45 PM on a Friday. That is a very different threshold, and current technology clears it comfortably.
Known Incident Patterns
While we will not name specific companies, the incident patterns that have been publicly documented or shared through threat intelligence channels fall into several clear categories.
Executive Impersonation for Wire Transfers
The most common pattern. An attacker impersonates a CEO or CFO on a video call with someone in the finance department who has wire transfer authority. The call is typically brief, urgent, and framed as confidential. "I need you to process this transfer before close of business. I will send the details over email. Do not loop anyone else in, this is related to the acquisition we have been working on." The combination of visual confirmation, voice match, and social pressure is extremely effective.
Fake Onboarding Calls
Attackers impersonate new hires or contractors during video onboarding sessions to obtain credentials, VPN access, or internal system accounts. The HR or IT person on the other end of the call sees someone who matches the photo on the employment application, sounds professional, and provides all the right details (which were likely obtained through a separate phishing or social engineering effort). By the end of a 20-minute onboarding call, the attacker has legitimate credentials to internal systems.
Vendor Verification Fraud
An attacker impersonates a vendor representative during a verification call to change bank account details for future payments. This exploits the trust that organizations place in face-to-face verification. "We have updated our banking information. I am calling to confirm the change per your policy." The person receiving the call sees someone who looks and sounds like the vendor contact they have worked with for years.
How to Spot Deepfake Artifacts During a Live Call
Even the best deepfakes leave traces. Training your team to recognize these artifacts can add a critical layer of human detection.
- Lighting inconsistencies. Watch for shadows that do not match the apparent light source in the room. Deepfake models sometimes struggle with dynamic lighting, especially when the attacker's real environment has different lighting than the one the model was trained on. A face that appears evenly lit while the background shows strong directional light is a red flag.
- Ear and hairline glitches. The edges of the face, particularly around the ears and hairline, are where face-swapping models most often fail. Look for shimmering, flickering, or unnatural blending where the synthetic face meets the real background. Ears may appear to shift shape slightly between frames, or hair may seem to merge unnaturally with the skin.
- Audio-video desynchronization. Even with low latency, there are moments where lip movements fall slightly out of sync with audio. This is especially noticeable during rapid speech, laughter, or when the person turns their head. If someone's mouth movements consistently lag behind their words by even a fraction of a second, pay attention.
- Unnatural eye behavior. Deepfake models often produce eyes that do not track naturally. The gaze may seem fixed or slightly off, blinking patterns can appear irregular, and the reflection in the eyes may not match the actual environment. Ask someone to look at something specific during the call and watch whether their eye movement looks natural. Blink rate in particular is measurable on a recording; see the sketch after this list.
- Expression stiffness under pressure. Ask an unexpected question or make a joke. Real people react with subtle micro-expressions, a slight eyebrow raise, a brief smirk, a half-second of confusion. Deepfake models can struggle to reproduce these fast, subtle reactions convincingly, especially when the attacker is caught off guard and their real expression does not map cleanly onto the target's face.
- Background inconsistencies. The area immediately around the person's shoulders and head may show warping or artifacts, especially during movement. Virtual backgrounds can mask this, which is itself worth noting. If someone who normally uses their real office background suddenly appears with a virtual background, consider it a data point.
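Several of these tells can also be checked mechanically on a recording after the fact. As one hedged illustration of the eye-behavior point, here is a minimal blink-rate sketch built on dlib's 68-point facial landmarks and the standard eye-aspect-ratio trick. The predictor file path, the EAR threshold, and the normal blink range are assumptions you would tune, and an abnormal rate is a data point, not proof.

```python
# Minimal blink-rate check over a recorded call using dlib's 68-point
# landmarks. Assumes the standard predictor file has been downloaded
# to "shape_predictor_68_face_landmarks.dat" (adjust the path).
from math import dist

import cv2
import dlib

EAR_THRESHOLD = 0.21  # eye-aspect-ratio below this counts as closed (tunable)

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def eye_aspect_ratio(p):
    # EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|); approaches zero as the eye closes.
    return (dist(p[1], p[5]) + dist(p[2], p[4])) / (2.0 * dist(p[0], p[3]))

def blinks_per_minute(video_path):
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    blinks, closed, frames = 0, False, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames += 1
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for face in detector(gray):
            shape = predictor(gray, face)
            # Landmarks 36-41 and 42-47 are the two eyes.
            pts = [(shape.part(i).x, shape.part(i).y) for i in range(36, 48)]
            ear = (eye_aspect_ratio(pts[:6]) + eye_aspect_ratio(pts[6:])) / 2.0
            if ear < EAR_THRESHOLD and not closed:
                blinks, closed = blinks + 1, True
            elif ear >= EAR_THRESHOLD:
                closed = False
    cap.release()
    minutes = frames / fps / 60.0
    return blinks / minutes if minutes else 0.0

# Adults blink very roughly 10-20 times per minute on camera; a long
# stretch well below that range is worth a second look.
print(f"{blinks_per_minute('call_recording.mp4'):.1f} blinks/minute")
```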
A Practical Verification Protocol for High-Stakes Video Calls
Detection is important, but it is not enough on its own. You need a process that does not rely entirely on someone's ability to spot visual artifacts during a stressful moment. Here is a verification protocol designed for any video call that involves financial authorization, credential provisioning, or sensitive decision-making.
- Initiate the call yourself. Never act on an incoming call or a calendar invite you did not create. If someone asks you to join a Zoom call to discuss a wire transfer, hang up and initiate a new call to that person using contact information from your internal directory, not from the email or message that requested the call.
- Use a pre-shared challenge phrase. Establish a rotating verification phrase with anyone who has financial authorization power. At the start of the call, ask for the current phrase. This is low-tech and extremely effective. Attackers can clone a face and a voice, but they cannot clone knowledge that exists only in someone's head. A minimal sketch of how to rotate phrases without distributing them follows this list.
- Require out-of-band confirmation for all financial actions. This is the single most important control. No wire transfer, account change, or financial authorization should ever be approved based solely on a video call. Require a second confirmation through a different channel: a phone call to a known number, a signed message through an internal approval system, or an in-person confirmation. The policy should be simple and absolute: "No financial authorization via video call alone."
- Ask the unexpected. During the call, ask a question that only the real person would know the answer to. Not something that could be found online or in company records. Something personal and specific. "What did we talk about at lunch last Tuesday?" or "What was the issue you raised in the leadership offsite last month?" A deepfake operator cannot answer questions about experiences they never had.
- Request a specific physical action. Ask the person to hold up a specific number of fingers, touch their ear, or hold a piece of paper with a word you specify to the camera. Current face-swapping models handle standard head movements and expressions well but can struggle with unusual hand-to-face gestures or objects introduced into the frame. This is not foolproof, but it adds friction.
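The challenge-phrase step is the easiest one to get wrong in practice, because phrases that are emailed around eventually leak. One way to avoid distribution entirely, sketched below under stated assumptions, is to derive the current phrase from a shared secret and the ISO week with an HMAC: everyone holding the secret computes the same phrase, and it rotates automatically. The twelve-word list is illustrative; a real deployment would use a large vetted wordlist and share the secret through a password manager.

```python
# Deterministic weekly challenge phrase derived from a shared secret.
# Everyone holding the secret computes the same phrase for the current
# ISO week, so nothing has to be sent over email or chat.
import hashlib
import hmac
from datetime import date

# Illustrative only; use a large vetted wordlist (e.g., a diceware list).
WORDLIST = [
    "granite", "harbor", "velvet", "copper", "meadow", "lantern",
    "orchid", "summit", "cobalt", "thicket", "ember", "quarry",
]

def current_phrase(shared_secret: bytes, words: int = 3) -> str:
    year, week, _ = date.today().isocalendar()  # rotates every ISO week
    digest = hmac.new(shared_secret, f"{year}-W{week:02d}".encode(),
                      hashlib.sha256).digest()
    return " ".join(WORDLIST[digest[i] % len(WORDLIST)] for i in range(words))

secret = b"shared-through-your-password-manager"  # placeholder secret
print("This week's phrase:", current_phrase(secret))
```

Deriving the phrase instead of distributing it means no inbox or chat history ever contains a phrase at all; an attacker would need the secret itself.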
Policy Template: Out-of-Band Financial Verification
Here is a policy template you can adapt for your organization. The core principle is straightforward: video and voice are no longer sufficient proof of identity for financial decisions.
FINANCIAL AUTHORIZATION VERIFICATION POLICY
[Company Name] - Effective [Date]
PURPOSE
Video and voice can be convincingly replicated by AI.
This policy ensures that no financial action is authorized
based solely on video or audio confirmation.
SCOPE
This policy applies to all wire transfers, ACH payments
over $[threshold], changes to vendor banking details,
new vendor account setup, and any financial commitment
over $[threshold].
REQUIREMENTS
1. NO FINANCIAL AUTHORIZATION VIA VIDEO CALL ALONE
All financial actions within scope require out-of-band
confirmation through at least one additional channel:
- Phone call to a pre-registered number (not a number
provided during the video call)
- Signed approval in [internal approval system]
- In-person confirmation
- Encrypted message through [approved platform]
2. VERIFICATION CHALLENGE
All video calls involving financial authorization must
begin with the current verification challenge phrase.
Challenge phrases rotate [weekly/biweekly] and are
distributed through [secure internal channel].
3. CALLBACK PROTOCOL
If a financial request originates from an inbound video
call, the recipient must disconnect and re-initiate
the call using contact information from the company
directory. Do not use contact details provided in the
original request.
4. DUAL AUTHORIZATION
All wire transfers and vendor banking changes require
approval from two authorized individuals, confirmed
through separate channels.
5. REPORTING
Any suspected deepfake attempt must be reported to
[security team] immediately. Do not complete the
requested action. Document the call details including
time, participants, and any observed anomalies.
EXCEPTIONS
None. This policy has no exceptions regardless of
urgency, seniority, or stated confidentiality of
the transaction.
REVIEW
This policy is reviewed quarterly. Last updated: [Date]
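If any part of your payment workflow is automated, requirements 1 and 4 translate directly into code. The sketch below models that logic under assumptions: the channel names, the dataclasses, and the release check are illustrative, not a real payment system's API. It records a video-call approval but never counts it, and it releases a transfer only when two different people have confirmed through two different out-of-band channels.

```python
# Minimal model of requirements 1 and 4 above: a transfer is releasable
# only with approvals from two distinct people over two distinct
# out-of-band channels. All names and types here are illustrative.
from dataclasses import dataclass, field
from enum import Enum

class Channel(Enum):
    VIDEO_CALL = "video_call"          # never sufficient on its own
    PHONE_CALLBACK = "phone_callback"  # to a pre-registered number
    APPROVAL_SYSTEM = "approval_system"
    IN_PERSON = "in_person"

OUT_OF_BAND = {Channel.PHONE_CALLBACK, Channel.APPROVAL_SYSTEM, Channel.IN_PERSON}

@dataclass(frozen=True)
class Approval:
    approver: str
    channel: Channel

@dataclass
class WireTransfer:
    amount: float
    beneficiary: str
    approvals: list[Approval] = field(default_factory=list)

    def approve(self, approver: str, channel: Channel) -> None:
        self.approvals.append(Approval(approver, channel))

    def releasable(self) -> bool:
        valid = [a for a in self.approvals if a.channel in OUT_OF_BAND]
        # Two distinct humans and two distinct non-video channels.
        return (len({a.approver for a in valid}) >= 2
                and len({a.channel for a in valid}) >= 2)

transfer = WireTransfer(4_300_000, "example-beneficiary")
transfer.approve("cfo", Channel.VIDEO_CALL)       # recorded, never counted
transfer.approve("cfo", Channel.PHONE_CALLBACK)
print(transfer.releasable())                      # False: still one person
transfer.approve("controller", Channel.APPROVAL_SYSTEM)
print(transfer.releasable())                      # True: two people, two channels
```

The point of encoding the rule is that it cannot be waived in the moment: urgency, seniority, and confidentiality are exactly the levers the attack pulls, and the no-exceptions clause only holds if the tooling enforces it.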
Why Q2 Close Season Makes This Urgent
If you work in finance, you know what June looks like. Q2 close means a surge in wire transfers, vendor payments, and intercompany fund movements. Processing volumes are high. Deadlines are tight. People are approving things faster than usual because the close timeline does not wait.
Attackers study these rhythms. They know that a wire transfer request during close season raises fewer eyebrows than one in the middle of a quiet month. They know that finance teams are less likely to push back on verification steps when they are already behind on their close checklist. And they know that the social dynamics of urgency, where questioning a senior executive feels like slowing things down, work strongly in their favor during these periods.
This is exactly when your verification protocol needs to be airtight. The pressure of close season is not a reason to relax controls. It is the reason controls exist in the first place. Make sure your finance team knows that the out-of-band verification requirement applies especially during high-volume periods, and that nobody, regardless of their title, gets to bypass it.
Building Organizational Resilience
Technical detection and verification protocols are essential, but they work best when they are backed by a culture that takes this threat seriously. That means regular training: not a one-time presentation, but ongoing conversations about what deepfake attacks look like and how they evolve. Run tabletop exercises where your finance team practices responding to a simulated deepfake scenario. Make it realistic. Have someone actually call in pretending to be an executive requesting an urgent transfer, and see how your team responds.
It also means removing the social penalty for verification. If a junior accountant asks the CFO to confirm their identity through a callback, that should be praised, not punished. If someone delays a wire transfer because they wanted to complete the out-of-band verification, that is the system working as designed. Build a culture where healthy skepticism during video calls is expected and rewarded, not treated as an inconvenience. For more on building this kind of environment, see our post on building a security culture that actually works.
The Bottom Line
Real-time deepfakes are not a future problem. They are a current attack vector targeting businesses during their most vulnerable moments, and video calls can no longer be treated as proof of identity. The technology is good enough to fool people in real-time conversations, and it is only getting better.
The good news is that the defenses are straightforward. You do not need expensive AI detection software to protect your organization. You need a clear policy that says video calls alone are never sufficient for financial authorization. You need an out-of-band verification process that is simple, consistent, and non-negotiable. And you need a team that understands why these steps matter and feels empowered to follow them, even when the person on the other end of the call looks and sounds exactly like the CEO.
Start with the policy template above. Adapt it to your organization. Roll it out before Q2 close hits full stride. And if you have not already, read up on how AI is transforming social engineering attacks, because deepfakes are just one piece of a much larger shift in how threat actors operate.