A novel visual voice communication terminal for underground coal mines, based on a low-power ARM Cortex-A8 core, is proposed. The functions, hardware composition, and software architecture of the terminal are discussed. SIP, combined with voice and video encoders/decoders, is implemented on the Android operating system to realize visual voice communication. Experimental results show that the overall power consumption of the terminal can be kept within 20 W, which permits an intrinsically safe design; the CPU load is very light during voice communication, and the average CPU utilization stays below 50% during video compression and display, which meets the demands of visual intercom.