Optimization of Time-Frequency Feature Extraction and Classification for China UAV Frequency-Hopping Signals

Abstract
To address the critical challenges of high computational complexity and insufficient real-time performance in deep learning-based classification of China UAV frequency-hopping (FH) signals, I propose a method that leverages multi-scale time-frequency features obtained via 2D wavelet decomposition. By exploiting the physical properties of FH signals, particularly their energy concentration at specific time-frequency scales, this method significantly reduces data dimensionality while preserving discriminative features. Experimental validation on the DroneRFa dataset (25 classes of China UAV FH signals) demonstrates a recognition accuracy of 97.6%, only 1.2 percentage points lower than with raw time-frequency features, while achieving a 252× acceleration in classification speed. This enables real-time identification of China UAV radiation sources on resource-constrained embedded systems.


1 Introduction

The rapid proliferation of China UAV platforms necessitates advanced signal identification techniques for security and surveillance. Existing methods (acoustic, visual, radar, and RF-based) face limitations in complex urban environments. RF passive monitoring offers superior stealth and anti-jamming capabilities but suffers from high computational loads. China UAV systems, such as DJI's Phantom and Inspire series, widely adopt FH communications in control uplinks:

$$f(t) = A \sum_{k=0}^{N-1} W_T(t - kT_k) \cdot \cos\left[2\pi f_k (t - kT_k) + \varphi_n\right]$$

(Equation 1: FH signal model)

where $W_T$ is a window function, $f_k$ denotes hopping frequencies, and $A$ is the modulation amplitude. Traditional deep learning approaches use five input representations: time-domain, frequency-domain, time-frequency-domain (TFD), transform-domain, and multi-domain. TFD via Short-Time Fourier Transform (STFT) provides excellent separability but introduces prohibitive dimensionality:

$$\mathrm{STFT}_x(n,k) = \sum_{m=0}^{N-1} x(n+m)\,\omega(m)\, e^{-j 2\pi mk/N}$$

(Equation 2: Discrete STFT)

Here, $x(n)$ is the sampled signal, $\omega(m)$ is the Hanning window, and $N$ is the window length. STFT generates a 1024×1024 matrix per sample (DroneRFa dataset), demanding extensive storage and computation. To overcome this, I introduce a wavelet-based multi-scale feature extraction framework optimized for China UAV FH signals.
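A minimal NumPy sketch of the discrete STFT in Equation 2; the Hann window, the non-overlapping hop, and the single-tone test signal are illustrative assumptions rather than the exact DroneRFa preprocessing:

```python
import numpy as np

def stft_spectrogram(x, n_fft=1024, hop=None):
    """Discrete STFT (Eq. 2): windowed frames, one FFT column per frame."""
    hop = hop or n_fft                      # default: non-overlapping frames
    w = np.hanning(n_fft)                   # Hann window omega(m)
    n_frames = (len(x) - n_fft) // hop + 1
    frames = np.stack([x[i * hop : i * hop + n_fft] * w
                       for i in range(n_frames)])
    return np.abs(np.fft.fft(frames, axis=1)).T   # shape (n_fft, n_frames)

# Toy single-tone signal long enough to fill a full spectrogram
t = np.arange(1024 * 1024)
x = np.cos(2 * np.pi * 0.1 * t)
S = stft_spectrogram(x, n_fft=1024, hop=1024)
print(S.shape)  # (1024, 1024): the per-sample matrix size cited in the text
```

The 1024×1024 output makes the storage burden concrete: roughly 4 MB per sample in float32 before any compression.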


2 Methodology

2.1 Time-Frequency Multi-Scale Feature Extraction

FH signals exhibit energy concentration at fixed time-frequency scales due to their constant-rate hopping nature. I apply the 2D Discrete Wavelet Transform (2D-DWT) to STFT spectrograms to decompose signals into approximation ($LL$) and detail coefficients ($LH$, $HL$, $HH$):

$$\begin{cases} \varphi(x,y) = \varphi(x)\varphi(y) \\ \psi^H(x,y) = \psi(x)\varphi(y) \\ \psi^V(x,y) = \varphi(x)\psi(y) \\ \psi^D(x,y) = \psi(x)\psi(y) \end{cases}$$

(Equation 3: 2D wavelet basis functions)

The Haar wavelet is ideal for China UAV FH signals due to its orthogonality, symmetry, and step-like shape:

$$\varphi = \frac{1}{\sqrt{2}}[1, 1], \qquad \psi = \frac{1}{\sqrt{2}}[1, -1]$$

(Equation 4: Haar wavelet)

Multi-layer decomposition reduces dimensionality exponentially:

  • Layer 1: Input (1024×1024) → $LL_1$ (512×512)
  • Layer 2: $LL_1$ → $LL_2$ (256×256)
  • Layer 3: $LL_2$ → $LL_3$ (128×128)
  • Layer 4: $LL_3$ → $LL_4$ (64×64)

Table 1: Dimensionality Reduction via Wavelet Decomposition

| Decomposition Layer | Matrix Size | Data Reduction |
|---|---|---|
| Raw STFT | 1024×1024 | 1× (baseline) |
| Layer 1 ($LL_1$) | 512×512 | 4× |
| Layer 2 ($LL_2$) | 256×256 | 16× |
| Layer 3 ($LL_3$) | 128×128 | 64× |
| Layer 4 ($LL_4$) | 64×64 | 256× |
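The LL-only Haar cascade behind Table 1 can be sketched in a few lines of NumPy; the random matrix stands in for a real spectrogram, and a wavelet library such as PyWavelets would yield the same approximation coefficients:

```python
import numpy as np

def haar_ll(img):
    """One level of the 2-D Haar DWT, keeping only the LL approximation.
    Rows and columns are low-pass filtered with phi = [1, 1] / sqrt(2) and
    downsampled by 2, so each level halves both dimensions."""
    rows = (img[0::2, :] + img[1::2, :]) / np.sqrt(2)   # low-pass over rows
    ll = (rows[:, 0::2] + rows[:, 1::2]) / np.sqrt(2)   # low-pass over cols
    return ll

spec = np.random.rand(1024, 1024)   # stand-in for a 1024x1024 spectrogram
ll = spec
for layer in range(1, 5):
    ll = haar_ll(ll)
    print(f"Layer {layer} (LL{layer}): {ll.shape}")
# Layer 3 yields the 128x128 LL3 features fed to the classifier
```

Because each level discards the three detail sub-bands and halves both axes, the data volume shrinks by a factor of four per layer, matching Table 1.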

2.2 ResNet-18 Classifier with Residual Learning

I use ResNet-18 to learn hierarchical features from the approximation coefficients. Its residual blocks mitigate vanishing gradients via skip connections:

$$x_{m+1} = f(y_m), \qquad y_m = x_m + F(x_m, W_m)$$

(Equation 5: Residual block)

where $F$ is a residual function, $W_m$ are weights, and $f$ is the ReLU activation. The gradient flow is preserved as:

$$\frac{\partial L}{\partial x_m} = \frac{\partial L}{\partial x_M}\left(1 + \frac{\partial}{\partial x_m}\sum_{i=m}^{M-1} F(x_i, W_i)\right)$$

(Equation 6: Gradient propagation)
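Equation 5's skip connection can be illustrated with a toy forward pass; plain matrix products stand in for the 3×3 convolutions here, so this is a sketch of the residual structure, not the actual ResNet-18 layers:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, W1, W2):
    """Basic residual block (Eq. 5): y_m = x_m + F(x_m, W_m), x_{m+1} = f(y_m).
    F is two linear maps with an inner ReLU; when F is small, the block
    behaves like the identity, which is what keeps gradients flowing."""
    F = relu(x @ W1) @ W2        # residual branch F(x_m, W_m)
    return relu(x + F)           # skip connection, then ReLU f

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 64))
W1 = rng.standard_normal((64, 64)) * 0.01
W2 = rng.standard_normal((64, 64)) * 0.01
out = residual_block(x, W1, W2)
print(out.shape)  # (1, 64)
```

The additive identity path is also what produces the leading "1" in Equation 6: even if the summed residual gradients vanish, the upstream gradient still reaches layer $m$ intact.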

Table 2: ResNet-18 Configuration

| Layer Type | Output Size | Parameters |
|---|---|---|
| Convolution + BN | 128×128 | 7×7, 64, stride 2 |
| Max Pooling | 64×64 | 3×3, stride 2 |
| Residual Block 1 | 64×64 | [3×3, 64] × 2 |
| Residual Block 2 | 32×32 | [3×3, 128] × 2 |
| Residual Block 3 | 16×16 | [3×3, 256] × 2 |
| Residual Block 4 | 8×8 | [3×3, 512] × 2 |
| Global Avg Pooling | 1×1 | 512-dimensional |
| Fully Connected | 25 classes | 512×25 |

3 Experiments

3.1 DroneRFa Dataset and Setup

  • Dataset: 25 classes (24 China UAV models + background noise).
  • Signal Acquisition: Dual-channel RF receiver, 100 MS/s sampling rate.
  • Preprocessing: Segmented into 0.01s frames (non-overlapping).
  • Train/Validation/Test Split: 11,379/3,792/3,792 samples.
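At 100 MS/s, each 0.01 s frame contains one million samples. A minimal sketch of the non-overlapping segmentation step (the complex64 recording is a synthetic placeholder, not DroneRFa data):

```python
import numpy as np

FS = 100_000_000                   # 100 MS/s sampling rate
FRAME_SEC = 0.01                   # 0.01 s non-overlapping frames
FRAME_LEN = int(FS * FRAME_SEC)    # 1,000,000 samples per frame

def segment(iq, frame_len=FRAME_LEN):
    """Split a 1-D recording into non-overlapping frames, dropping the
    incomplete tail."""
    n_frames = len(iq) // frame_len
    return iq[: n_frames * frame_len].reshape(n_frames, frame_len)

rec = np.zeros(3_500_000, dtype=np.complex64)  # toy 0.035 s recording
frames = segment(rec)
print(frames.shape)  # (3, 1000000): three full frames, tail discarded
```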

3.2 Training Protocol

  • Optimizer: Adam (lr=1e-3).
  • Batch Size: 32.
  • Loss: Categorical cross-entropy.
  • Stopping Criterion: Early stopping after 50 epochs of no improvement.
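The stopping criterion above can be expressed as a small patience counter. This is a generic sketch (patience is set to 3 in the demo only to keep it short; training uses 50):

```python
class EarlyStopping:
    """Stop when validation accuracy fails to improve for `patience` epochs."""
    def __init__(self, patience=50):
        self.patience = patience
        self.best = float("-inf")
        self.stale = 0

    def step(self, val_acc):
        """Record one epoch's validation accuracy; return True to stop."""
        if val_acc > self.best:
            self.best, self.stale = val_acc, 0
        else:
            self.stale += 1
        return self.stale >= self.patience

es = EarlyStopping(patience=3)
flags = [es.step(a) for a in [0.50, 0.60, 0.60, 0.60, 0.60]]
print(flags)  # [False, False, False, False, True]
```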

3.3 Results

Table 3: Performance Comparison of Wavelet Decomposition Layers

| Feature Input | Accuracy (%) | Classification Time (s) | Speedup vs. STFT |
|---|---|---|---|
| Raw STFT (baseline) | 98.80 | 183.9 | 1× |
| Wavelet $LL_1$ | 98.07 | 8.21 | 22.4× |
| Wavelet $LL_2$ | 97.28 | 2.29 | 80.3× |
| Wavelet $LL_3$ | 97.60 | 0.73 | 252× |
| Wavelet $LL_4$ | 96.39 | 0.41 | 448× |

The $LL_3$ coefficients achieve the optimal trade-off:

  • Accuracy: 97.60% (only 1.2 percentage points below raw STFT).
  • Efficiency: 0.73 s per inference (252× faster than STFT).

4 Conclusion

I have developed a wavelet-optimized framework for real-time identification of China UAV frequency-hopping radiation sources. By extracting multi-scale time-frequency features via 2D-DWT and selecting Layer 3 approximation coefficients, this method reduces dimensionality by 64× while retaining 97.6% classification accuracy. The integration with ResNet-18 ensures robust feature learning, enabling deployment on embedded platforms for field applications. This work addresses a critical gap in low-altitude China UAV surveillance, providing a scalable solution for modern electronic warfare and spectrum monitoring.
