As smartphones have become ubiquitous, people increasingly depend on mobile applications to manage their social activities. Traditional text messaging alone no longer fulfills the versatile requirements of social networking, so many mobile applications deliver multimodal messages, including stickers, voice and photo messages, video calls, and short video clips, to enhance their communicative capability. However, without face-to-face interaction, people may fail to perceive the other party's non-verbal social behavior, such as fine-grained facial expressions, body movements, or hand gestures. During social interaction, non-verbal behavior conveys information about the individuals involved and helps speakers express social emotion implicitly; it is essential in real-world face-to-face interaction but is largely lost in mobile telephony. To address this problem, we propose an affective computing model that assists the representation of social emotion and thereby supports social interaction in mobile telephony. In this model, real-time affective analysis is delegated to a cloud-side service, which improves the system's scalability and availability. Our experimental results demonstrate the feasibility of the proposed design for social-intelligence applications, and the system provides a research framework for socially intelligent systems in mobile telephony.
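
To make the client-to-cloud division of labor concrete, the minimal sketch below shows one way a mobile client might offload a camera frame to a remote affect-analysis service and receive an emotion estimate in return. The endpoint URL, request format, and response fields are all hypothetical assumptions for illustration; the abstract does not specify the system's actual API.

```python
# Minimal sketch of the client-to-cloud offloading pattern described above.
# CLOUD_ENDPOINT, the request schema, and the response fields are hypothetical;
# they stand in for whatever interface the actual cloud-side service exposes.
import requests

CLOUD_ENDPOINT = "https://example.com/affect/analyze"  # hypothetical service URL

def analyze_frame(jpeg_bytes: bytes) -> dict:
    """Send one camera frame to the cloud service and return its
    affect estimate (e.g., a facial-expression label with confidence)."""
    resp = requests.post(
        CLOUD_ENDPOINT,
        files={"frame": ("frame.jpg", jpeg_bytes, "image/jpeg")},
        timeout=2.0,  # keep the round trip short for real-time interaction
    )
    resp.raise_for_status()
    return resp.json()  # e.g., {"emotion": "happy", "confidence": 0.87}

if __name__ == "__main__":
    with open("frame.jpg", "rb") as f:
        print(analyze_frame(f.read()))
```

Keeping the heavy analysis on the server is what lets the thin mobile client stay responsive while the cloud side scales out to serve many users, as the model above intends.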