Our team at the Army Artillery and Air Defense Academy in Nanjing, China, addresses critical vulnerabilities in China UAV swarm communications. Electromagnetic jamming severely disrupts swarm coordination, leading to high bit error rates (BER) and latency. Traditional single-agent anti-jamming methods fail in dynamic, adversarial environments. To overcome this, we propose a Multi-Agent Reinforcement Learning (MARL) framework that enables decentralized, adaptive decision-making across UAV swarms.

1. MARL Framework for China UAV Swarms
China UAV swarms operate as collaborative agents where each UAV autonomously learns optimal communication strategies. The state space $S$ for agent $i$ includes:

$$s_i = [\text{Position}_i,\ \text{Velocity}_i,\ \text{SNR}_i,\ \text{Jamming Intensity}_i,\ \text{Neighbor States}]$$
Actions $A$ adjust frequency, power, and routing paths. The reward function $R$ is:

$$R = \omega_1 \cdot (-\text{BER}) + \omega_2 \cdot (-\text{Latency}) + \omega_3 \cdot \text{Data Rate}, \quad \sum_i \omega_i = 1$$
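The weighted reward can be sketched in a few lines of Python. This is only an illustration: the weight values, the `RewardWeights` and `reward` names, and the assumption that BER, latency, and data rate are pre-normalized to [0, 1] are ours and are not given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Illustrative values only; the source does not state the weights.
    ber: float = 0.5       # omega_1
    latency: float = 0.3   # omega_2
    rate: float = 0.2      # omega_3

    def __post_init__(self):
        # Enforce the constraint sum(omega_i) = 1 from the reward definition.
        if abs(self.ber + self.latency + self.rate - 1.0) > 1e-9:
            raise ValueError("weights must sum to 1")


def reward(ber, latency, data_rate, w=RewardWeights()):
    """R = w1 * (-BER) + w2 * (-Latency) + w3 * DataRate.

    Inputs are assumed to be normalized to [0, 1] (our assumption), so that
    no single term dominates because of its units.
    """
    return w.ber * (-ber) + w.latency * (-latency) + w.rate * data_rate
```

With these weights, a clean link (zero BER, zero latency, full rate) scores the maximum reward and a fully jammed link the minimum, so the agent is pushed toward low BER and latency first and throughput second.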
Key Innovations
- Complex-valued Neural Networks: Critical for processing RF signal data.
- Complex Convolution: For input $a = A + iB$ and kernel $M = X + iY$:
- Complex Batch Normalization:
- Distributed Jamming Monitoring: Each China UAV shares spectral data via local consensus.
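For the complex convolution item, linearity of convolution gives the standard expansion $M * a = (X * A - Y * B) + i\,(Y * A + X * B)$, so one complex convolution can be computed from four real ones. The sketch below checks this identity on a small 1-D example in plain Python. The function names and sample values are hypothetical, and a 1-D full convolution is assumed purely for brevity; the framework's actual layer shapes are not specified in the text.

```python
def conv1d(x, k):
    """Full 1-D discrete convolution of two sequences (real or complex)."""
    out = [0] * (len(x) + len(k) - 1)
    for i, xi in enumerate(x):
        for j, kj in enumerate(k):
            out[i + j] += xi * kj
    return out


def complex_conv1d(A, B, X, Y):
    """Convolve a = A + iB with M = X + iY using four real convolutions.

    Returns the (real, imaginary) parts as two lists.
    """
    XA, YB = conv1d(A, X), conv1d(B, Y)
    YA, XB = conv1d(A, Y), conv1d(B, X)
    real = [p - q for p, q in zip(XA, YB)]   # X*A - Y*B
    imag = [p + q for p, q in zip(YA, XB)]   # Y*A + X*B
    return real, imag


# Hypothetical sample signal and kernel.
A, B = [1.0, 2.0, 3.0], [0.5, -1.0, 0.0]
X, Y = [1.0, -1.0], [2.0, 0.5]

real, imag = complex_conv1d(A, B, X, Y)

# Reference: convolve directly with Python's built-in complex type.
direct = conv1d([complex(p, q) for p, q in zip(A, B)],
                [complex(p, q) for p, q in zip(X, Y)])
```

The two results agree element by element, which is why complex-valued layers are commonly implemented on top of real-valued convolution primitives.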
2. Performance Evaluation
Experiments simulated 50 China UAVs under jamming intensities (0–300 dB).
Table 1: Bit Error Rate vs. Training Iterations
Iterations | BER (%) | Reduction (%) |
---|---|---|
0 | 10.0 | 0.0 |
50 | 9.6 | 4.0 |
250 | 3.4 | 66.0 |
300 | 4.1 | 59.0 |
Optimal Iterations: 250. BER reaches its minimum of 3.4% at 250 iterations and rises to 4.1% at 300, which we attribute to overfitting.
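The Reduction column in Table 1 is the relative drop in BER from the untrained baseline (iteration 0). A short check, using only the values from the table (the variable names are ours):

```python
# BER (%) by training iteration, copied from Table 1.
ber_by_iteration = {0: 10.0, 50: 9.6, 250: 3.4, 300: 4.1}

baseline = ber_by_iteration[0]

# Relative reduction versus the iteration-0 baseline, in percent.
reduction = {
    it: round(100 * (baseline - ber) / baseline, 1)
    for it, ber in ber_by_iteration.items()
}

# Iteration count with the lowest BER.
best_iteration = min(ber_by_iteration, key=ber_by_iteration.get)

print(reduction)        # {0: 0.0, 50: 4.0, 250: 66.0, 300: 59.0}
print(best_iteration)   # 250
```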
Table 2: Latency (ms) under Jamming
Jamming (dB) | MARL | Deep Learning | Cluster Algorithm |
---|---|---|---|
60 | 58 | 59 | 56 |
120 | 63 | 135 | 121 |
180 | 75 | 273 | 262 |
300 | 71 | 268 | 255 |
- At 300 dB, MARL reduces latency by 73.5% relative to deep learning (71 ms vs. 268 ms) and by 72.2% relative to the cluster algorithm (71 ms vs. 255 ms).
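The latency comparison can be reproduced directly from Table 2. The helper name `relative_reduction` and the dictionary keys are ours; the numbers are the table's.

```python
# Latency (ms) by jamming level (dB), copied from Table 2.
latency_ms = {
    60:  {"marl": 58, "deep_learning": 59,  "cluster": 56},
    120: {"marl": 63, "deep_learning": 135, "cluster": 121},
    180: {"marl": 75, "deep_learning": 273, "cluster": 262},
    300: {"marl": 71, "deep_learning": 268, "cluster": 255},
}


def relative_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline` (negative if higher)."""
    return round(100 * (baseline - value) / baseline, 1)


row = latency_ms[300]
vs_deep_learning = relative_reduction(row["deep_learning"], row["marl"])
vs_cluster = relative_reduction(row["cluster"], row["marl"])

# At the lightest jamming level the cluster algorithm is slightly faster.
low = latency_ms[60]
vs_cluster_at_60 = relative_reduction(low["cluster"], low["marl"])
```

Here `vs_deep_learning` evaluates to 73.5 and `vs_cluster` to 72.2, while `vs_cluster_at_60` is negative (-3.6), showing that MARL's latency advantage appears only once jamming becomes significant.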
Table 3: Data Rate (Mbps)
Jamming (dB) | MARL | Deep Learning |
---|---|---|
0 | 58 | 55 |
120 | 53 | 42 |
180 | 50 | 28 |
300 | 22 | 9 |
Even at 300 dB, MARL sustains >20 Mbps for China UAV operations.
3. Technical Advantages for China UAVs
- Adaptive Anti-Jamming:
- Agents learn frequency hopping and power control policies:
- Scalability: Computation is distributed across China UAVs.
- Robustness: Complex-valued networks tolerate signal noise 2.3× better than real-valued equivalents.
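One way to picture the frequency-hopping and power-control policy is as an epsilon-greedy choice over joint (channel, power) actions. The sketch below is a generic illustration and not the framework's actual policy: the channel indices, power levels, and the `select_action` helper are hypothetical.

```python
import itertools
import random

# Hypothetical discrete action space: 4 channels x 3 transmit power levels.
CHANNELS = (0, 1, 2, 3)
POWER_LEVELS_DBM = (10, 20, 30)
ACTIONS = tuple(itertools.product(CHANNELS, POWER_LEVELS_DBM))


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over joint (channel, power) actions.

    q_values maps an action tuple to its current value estimate; actions
    that have not been tried yet default to 0.0.
    """
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)  # explore: hop to a random channel/power
    return max(ACTIONS, key=lambda a: q_values.get(a, 0.0))  # exploit


rng = random.Random(0)
q = {(2, 20): 1.0}                      # pretend channel 2 at 20 dBm worked well
greedy = select_action(q, 0.0, rng)     # epsilon = 0: always exploit
explored = select_action(q, 1.0, rng)   # epsilon = 1: always explore
```

Keeping some exploration lets an agent discover channels a jammer has vacated, while the greedy branch keeps it on the best-known setting the rest of the time.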
4. Discussion
Why MARL Excels for China UAV Swarms
- Collaboration: Agents share $Q$-values via: $Q_i(s,a) \leftarrow (1-\alpha)\,Q_i(s,a) + \alpha\left[R_i + \gamma \max_{a'} Q_j(s',a')\right]$
- Dynamic Optimization: Policies update every 100 ms using on-policy SARSA.
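The shared update can be sketched with tabular $Q$-functions. Note that the rule as printed bootstraps from a max over neighbor $j$'s next-state values, which is the Q-learning form; the sketch follows the formula as written. The function name, the dictionary-based tables, and the example states and actions are hypothetical.

```python
from collections import defaultdict


def cooperative_q_update(q_i, q_j, s, a, r, s_next, actions,
                         alpha=0.1, gamma=0.9):
    """Q_i(s,a) <- (1 - alpha) * Q_i(s,a) + alpha * [r + gamma * max_a' Q_j(s',a')].

    q_i is the updating agent's table and q_j a neighbor's shared table,
    both keyed by (state, action). Returns the new Q_i(s, a).
    """
    best_next = max(q_j[(s_next, a2)] for a2 in actions)
    q_i[(s, a)] = (1 - alpha) * q_i[(s, a)] + alpha * (r + gamma * best_next)
    return q_i[(s, a)]


# Toy example with made-up states and actions.
q_i = defaultdict(float)
q_j = defaultdict(float)
q_j[("jammed", "hop")] = 2.0   # neighbor has learned that hopping helps

new_value = cooperative_q_update(
    q_i, q_j, s="clear", a="stay", r=1.0, s_next="jammed",
    actions=["stay", "hop"], alpha=0.5, gamma=0.5,
)
# new_value = 0.5 * 0 + 0.5 * (1.0 + 0.5 * 2.0) = 1.0
```

Bootstrapping from a neighbor's table lets an agent benefit from jamming conditions it has not yet encountered itself.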
Limitations
- Training requires 250 iterations (∼8 hours in simulation).
- At 300 dB jamming, the MARL data rate falls to 22 Mbps, a 62% drop from the 58 Mbps unjammed rate, though communication persists.
5. Conclusion
MARL is a transformative solution for China UAV swarm communications in contested spectra. Our framework achieves:
- 66% lower BER at 250 training iterations.
- 71 ms latency under 300 dB jamming, 73.5% lower than the deep learning baseline (268 ms).
- 22 Mbps throughput in extreme jamming, enabling mission continuity.