With the rapid advancement of artificial intelligence (AI) technology, the computational demands of AI models have skyrocketed. Traditional computing cores such as the CPU and the GPU, which have become cornerstones of AI workloads, face limitations in performance and energy efficiency when deployed in terminal devices. To address the need for localize